A  SEQUENTIAL  QUADRATIC  PROGRAMMING  ALGORITHM  USING
AN  INCOMPLETE  SOLUTION  OF  THE  SUBPROBLEM 


Walter Murray* and Francisco J. Prieto†

*Systems Optimization Laboratory
Department of Operations Research
Stanford University

†Dept. de Estadística y Econometría
Universidad Carlos III de Madrid


Abstract 

We analyze sequential quadratic programming (SQP) methods to solve nonlinear
constrained optimization problems that are more flexible in their definition
than standard SQP methods. The type of flexibility introduced is
motivated by the necessity to deviate from the standard approach when
solving large problems. Specifically, we no longer require a minimizer of
the QP subproblem to be determined or particular Lagrange multiplier estimates
to be used. Our main focus is on an SQP algorithm that uses a
particular augmented Lagrangian merit function. New results are derived
for this algorithm under weaker conditions than previously assumed; in
particular, it is not assumed that the iterates lie on a compact set.

1.  Introduction 

The  problem  of  interest  is  the  following: 

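$$ \text{NP} \qquad \min_{x \in \mathbb{R}^n} \; F(x) \quad \text{subject to} \quad c(x) \ge 0, $$

where $F : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$. Since we shall not assume second derivatives are
known, computing $x^*$, a point satisfying the first-order KKT conditions for NP, is the best
that can be achieved. Such points are feasible and satisfy the following conditions:

$$ \nabla F(x^*) = \nabla c(x^*)^T \lambda^*, \qquad \lambda_j^* c_j(x^*) = 0, \quad j = 1, \ldots, m, \tag{1.1} $$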

for some nonnegative multiplier vector $\lambda^* \in \mathbb{R}^m$. Whenever the term “KKT point” is
used in the following sections, what will be meant is a point satisfying the first-order KKT
conditions for NP. Despite this theoretical limitation we shall prefer some KKT points to
others in order to try and satisfy our real purpose of finding a minimizer. For example, if
the initial estimate is feasible we do not wish to converge to a nearby KKT point if at that
point the objective function is higher.

*Research supported by the National Science Foundation Grant DDM-9204208; the Department of Energy
Grant DE-FG03-92ER25117; the Office of Naval Research Grant N00014-90-J-1242 and the NATO travel
grant No. 5n0525

We use the term stationary point to denote a point that is feasible and satisfies (1.1) for
some multiplier vector $\lambda \in \mathbb{R}^m$ that is not necessarily nonnegative.
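
The distinction between a KKT point and a stationary point can be illustrated with a small check of conditions (1.1). The sketch below is only an illustration: the helper name `classify_point`, the tolerance and the two toy problems are assumptions made here, not part of the algorithm analyzed in this paper.

```python
def classify_point(grad_f, jac_c, c, lam, tol=1e-10):
    """Classify a point by conditions (1.1): 'kkt', 'stationary' or 'neither'.

    grad_f : gradient of F at the point (length n)
    jac_c  : Jacobian of c at the point (m rows of length n)
    c      : constraint values c(x) (length m); feasible when c >= 0
    lam    : candidate multiplier vector (length m)
    """
    n, m = len(grad_f), len(c)
    feasible = all(cj >= -tol for cj in c)
    # Residual of grad F(x) = grad c(x)^T lambda.
    resid = [grad_f[i] - sum(jac_c[j][i] * lam[j] for j in range(m)) for i in range(n)]
    gradient_ok = all(abs(r) <= tol for r in resid)
    # Complementarity lambda_j c_j(x) = 0 for every constraint.
    complementary = all(abs(lam[j] * c[j]) <= tol for j in range(m))
    if not (feasible and gradient_ok and complementary):
        return "neither"
    return "kkt" if all(lj >= -tol for lj in lam) else "stationary"


# Toy problem (assumed): F(x) = x1^2 + x2^2, c(x) = x1 - 1 >= 0, at x = (1, 0).
print(classify_point([2.0, 0.0], [[1.0, 0.0]], [0.0], [2.0]))    # prints kkt
# Toy problem (assumed): F(x) = -x1^2, same constraint and point; multiplier is -2.
print(classify_point([-2.0, 0.0], [[1.0, 0.0]], [0.0], [-2.0]))  # prints stationary
```

In the second case the multiplier is negative, so moving into the interior of the feasible region reduces the objective, which is why such a point is not a candidate minimizer.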

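Typically SQP algorithms generate a sequence of points $\{x_k\}$ converging to a solution,
by solving at each point, $x_k$, a quadratic program (QP) of the form

$$ \text{QP} \qquad \min_{p \in \mathbb{R}^n} \; \nabla F(x_k)^T p + \tfrac12 p^T H_k p \quad \text{subject to} \quad \nabla c(x_k)\, p + c(x_k) \ge 0, $$

for some positive definite matrix $H_k$. Let $p_k$ (referred to as the search direction) denote the
unique solution to QP. We define $x_{k+1} = x_k + \alpha_k p_k$, where the steplength $\alpha_k$ is chosen to
achieve a reduction in a merit function.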

SQP  algorithms  are  viewed  by  many  as  the  best  approach  to  the  solution  of  NP  when  n 
is  small  (  <  200  ).  As  the  size  of  the  problem  grows,  usually  so  does  the  relative  importance 
of  the  effort  to  solve  QP  when  compared  to  the  total  effort.  Indeed  for  many  large  problems 
the  effort  to  solve  QP  dominates  the  total  effort. 

When  the  minimizer  of  QP  is  used  to  define  the  search  direction,  it  is  not  necessary  in  any 
theoretical  discussion  of  an  SQP  algorithm  to  define  how  the  QP  subproblem  is  solved.  All 
implementations  of  SQP  methods  currently  available  use  an  active-set  method  to  solve  the 
QP  subproblem.  For  a  comprehensive  survey  of  active-set  methods  see  [GMW81],  [Fle87] 
and  [GMSW91].  The  potential  number  of  iterations  to  solve  a  QP  using  an  active-set 
method  grows  exponentially  with  n.  In  practice  the  number  of  iterations  grows  much  more 
slowly than exponentially (if this were not the case, active-set methods would be hopelessly
inefficient).  Nonetheless,  the  number  of  iterations  required  to  solve  a  large  QP  is  usually 
large.  In  any  implementation  of  an  SQP  method  it  is  necessary  to  limit  the  number  of 
iterations  allowed  to  solve  a  given  QP  subproblem.  If  the  QP  solution  process  is  terminated 
prematurely  the  SQP  algorithm  may  break  down.  It  is  in  part  for  this  reason  that  the 
development  of  SQP  methods  for  large-scale  problems  has  been  inhibited.  Even  for  small 
problems  there  are  occasions  when  the  number  of  QP  iterations  is  excessive.  Since  the 
definition  of  “small”  continues  to  increase  as  computers  become  more  powerful  we  can 
expect  the  cost  of  solving  the  subproblems  to  grow  in  importance. 

In  the  algorithms  presented  here  we  have  endeavored  to  improve  the  efficiency  of  SQP 
methods  by  circumventing  the  need  to  determine  the  minimizer  of  QP.  We  show  that  a 
suitable  search  direction  may  be  computed  from  information  available  at  any  stationary 
point  of  QP.  Stationary  points  occur  as  iterates  within  most  active-set  methods  to  solve  QP 
and  for  such  methods  the  number  of  iterations  to  determine  a  stationary  point  increases 
only  linearly  with  the  size  of  the  problem.  Consequently,  the  search  direction  may  be  found 
by  applying  an  active-set  method  to  QP  and  terminating  the  procedure  early. 

It may be thought that by expending much less effort to compute the search direction,
the number of iterations for the outer algorithm may increase. However, it has been observed
that large numbers of QP iterations are required only when $x_k$ is a poor approximation to
$x^*$, that is, when the QP subproblem does not model the nonlinear problem well. We
hypothesize that a search direction based on the minimizer of such subproblems is little
better than one based on information at a stationary point. Our preliminary results reported in
Section 6 support this hypothesis.

Not  solving  the  QP  subproblem  also  implies  that  we  do  not  know  the  QP  multipliers, 
which  are  often  used  to  estimate  the  multipliers  of  NP.  In  general,  SQP  methods  usually 
use  some  specific  estimate  of  the  NP  multipliers  in  the  definition  of  the  method  and  hence 
in  the  proof  of  convergence.  When  solving  large  problems  specific  definitions  of  multiplier 
estimates  are  not  always  computationally  attractive.  In  our  analysis  we  allow  for  flexibility 
in  how  multipliers  are  defined  by  requiring  only  that  the  multiplier  estimates  satisfy  certain 
conditions. 


Incomplete  solutions  for  QP  subproblems 

There have been other proposals to define the search direction for an SQP algorithm other
than as the minimizer of the QP subproblem. In Dembo and Tulowitzki [DT85] an algorithm
is analyzed for which the search direction $p_k$ has the property that

$$ \|p_k^* - p_k\| = o(\|p_k\|), $$

where $p_k^*$ denotes the minimizer for the $k$-th QP subproblem.

We  follow  a  different  approach  and  define  a  search  direction  for  which  the  effort  to 
compute  it  has  a  guaranteed  bound.  A  different  algorithm,  but  using  the  same  approach, 
was suggested by Gurwitz and Overton [GO89]. However, no global convergence results
were  given  for  their  algorithm. 

In  the  course  of  solving  a  QP  an  active-set  method  generates  iterates  that  are  stationary 
points.  We  show  that  such  points  may  be  used  to  construct  a  suitable  search  direction.  The 
step  to  the  stationary  point  is  not  in  general  an  adequate  search  direction.  However,  if 
the  stationary  point  is  not  a  minimizer  then  there  exist  nonoptimal  multipliers.  We  show 
how  an  auxiliary  direction  may  be  constructed  using  information  about  the  nonoptimal 
multipliers.  This  auxiliary  direction,  when  combined  with  the  step  to  the  stationary  point, 
gives  a  suitable  search  direction. 

Terminating  the  QP  algorithm  prior  to  obtaining  a  solution  impacts  the  SQP  algorithm 
in  a  number  of  critical  ways.  Not  only  is  the  search  direction  different,  but  also  the  QP 
multipliers will not be available. The merit function of principal interest requires the definition
of a search direction in the space of the multipliers. In the past, this search direction
has  been  defined  using  the  QP  multipliers.  The  fact  that  such  multipliers  are  positive  was 
crucial  in  the  analysis  of  these  algorithms.  The  consequences  of  terminating  the  QP  solution 
process  early  are  therefore  far  reaching. 

The remainder of this paper is organized as follows. Section 2 describes the form of
the general algorithm, and the definition of the search direction. Section 3 studies the
convergence properties of the algorithm; it is shown that such an algorithm is globally
convergent. In Section 4 we show that the algorithm converges superlinearly. We also show
that  the  penalty  parameter  used  in  the  merit  function  is  bounded.  Section  5  considers  the 
use  of  alternative  merit  functions.  Finally,  Section  6  presents  numerical  results  obtained 
from  an  implementation  that  uses  the  merit  function  of  principal  interest. 


2.  Description  of  the  algorithm 

The  search  direction  we  propose  could  be  used  with  most  of  the  merit  functions  analyzed 
in  the  literature.  However,  our  primary  interest  is  the  following  merit  function: 

$$ L_A(x, \lambda, s, \rho) = F(x) - \lambda^T (c(x) - s) + \tfrac12 \rho\, (c(x) - s)^T (c(x) - s), \tag{2.1} $$

where $s \ge 0$ are slack variables, and the scalar $\rho$ is known as the penalty parameter.
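
As a concrete reading of (2.1), the following minimal sketch evaluates the merit function for user-supplied callables. The function name `merit` and the test problem are assumptions chosen for illustration; they are not taken from NPSOL or any other code mentioned in this paper.

```python
def merit(F, c, x, lam, s, rho):
    """Evaluate the augmented Lagrangian merit function (2.1)."""
    r = [cj - sj for cj, sj in zip(c(x), s)]            # c(x) - s
    linear = sum(lj * rj for lj, rj in zip(lam, r))     # lambda^T (c(x) - s)
    quadratic = sum(rj * rj for rj in r)                # ||c(x) - s||^2
    return F(x) - linear + 0.5 * rho * quadratic


# Hypothetical test problem: F(x) = x1^2 + x2^2, c(x) = x1 + x2 - 1.
F = lambda x: x[0] ** 2 + x[1] ** 2
c = lambda x: [x[0] + x[1] - 1.0]
value = merit(F, c, x=[1.0, 1.0], lam=[2.0], s=[0.5], rho=4.0)
print(value)  # 2 - 2*0.5 + 0.5*4*0.25 = 1.5
```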

This  merit  function  was  suggested  by  Gill  et  al.  [GMSW86b]  and  is  used  in  the  SQP 
code  NPSOL.  It  is  similar  to  merit  functions  proposed  by  Wright  [Wri76]  and  Schittkowski 
[Sch81].  Although  our  primary  interest  is  this  specific  merit  function,  we  also  show  (Section 
5) how the ideas discussed can be extended to the use of other merit functions. Our
focus on this merit function is due to the success in practice of NPSOL. The merit
function  is  also  used  in  a  new  SQP  code,  LSSQP  [Eld91],  designed  to  solve  large  problems. 

The search is performed on an expanded space, including the Lagrange multiplier estimates
$\lambda$, and the slack variables $s$. The symbols $p$, $\xi$ and $q$ will be used to denote the
components of the search direction on the corresponding subspaces. In this case, the value
of the merit function as a function of the steplength will be denoted by

$$ \phi(\alpha) = L_A(x + \alpha p, \lambda + \alpha \xi, s + \alpha q, \rho). \tag{2.2} $$

The derivative of $\phi$ with respect to $\alpha$ is denoted by $\phi'$. Also, $\phi_k(\alpha)$ and $\phi_k'(\alpha)$ will be used
to indicate the values of $\phi$ and $\phi'$ evaluated at $(x_k, p_k, \lambda_k, \xi_k, s_k, q_k, \rho_k)$.

The  following  conventions  will  be  used  in  the  rest  of  the  paper: 

$$ g_k = \nabla F(x_k), \qquad A_k = \nabla c(x_k), \qquad c_k = c(x_k), $$

and the symbols $\hat{A}_k$ and $\hat{c}_k$ will be used with the same meaning as $A_k$ and $c_k$, but restricted
to the set of active constraints at the given point. The term active constraint will be used
to designate a constraint that is satisfied exactly at the current point ($c_j(x) = 0$ in NP, or
$a_j^T p = -c_j$ in QP), and the set of all constraints active at a given point will be referred to
as the active set at the point.

The objective function for the QP subproblem will be denoted by $\psi_k(p)$,

$$ \psi_k(p) = g_k^T p + \tfrac12 p^T H_k p. \tag{2.3} $$

Sometimes, $\psi$ will denote the function of one variable $\psi_k(\gamma) = \psi_k(p + \gamma d)$.

For any vector $v$, the notation $v^-$ will be used to denote the vector whose $j$-th element
is defined as

$$ v_j^- = -\min(0, v_j). $$

Finally, the symbol $e$ denotes the vector $(1, \ldots, 1)^T$, and symbols of the form $\beta_{abc}$ denote
fixed scalars related to properties of the problem, or the implementation of the algorithm,
where “abc” identifies the specific scalar represented.
The algorithm

We first present an outline of the algorithm. Given $H_0$ positive definite, $x_0$ and $\lambda_0$, select
$\rho_{-1} \ge 0$, $0 < \sigma < \eta < 1$, $\beta_c > \|c^-(x_0)\|_\infty$, $\beta_\mu \ge \|\lambda_0\|$ and $\beta_\rho > 0$.

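Algorithm ETSQP

$k \leftarrow 0$
repeat
    Obtain the search direction $p_k$ from the QP subproblem
        $\min_p \; \psi_k(p) = g_k^T p + \tfrac12 p^T H_k p$
        s.t. $A_k p + c_k \ge 0$
    Compute $\mu_k$, an estimate of $\lambda^*$ such that $\|\mu_k\| \le \beta_\mu$
    $\xi_k \leftarrow \mu_k - \lambda_k$
    if $\rho_{k-1} = 0$
        Compute $s_k$ from $(s_k)_j = \max(0, (c_k)_j)$
    else
        Compute $s_k$ from $(s_k)_j = \max(0, (c_k)_j - (\lambda_k)_j / \rho_{k-1})$
    end if
    $q_k \leftarrow A_k p_k + c_k - s_k$
    if $\phi_k'(0) \le -\tfrac12 p_k^T H_k p_k$
        $\rho_k \leftarrow \rho_{k-1}$
    else
        $\hat{\rho}_k \leftarrow \bigl(\psi_k(p_k) + (2\lambda_k - \mu_k)^T (c_k - s_k)\bigr) / \|c_k - s_k\|^2$
        $\rho_k \leftarrow \max(2\rho_{k-1}, \hat{\rho}_k, \beta_\rho)$
    end if
    if $\phi_k(1) \le \phi_k(0) + \sigma \phi_k'(0)$
        $\alpha \leftarrow 1$
    else
        Select $\alpha \in (0, 1)$ to satisfy
            $\phi_k(\alpha) \le \phi_k(0) + \sigma \alpha \phi_k'(0)$, $|\phi_k'(\alpha)| \le -\eta \phi_k'(0)$
    end if
    while $c(x_k + \alpha p_k) \not\ge -\beta_c e$ or $\phi_k(\alpha) > \phi_k(0) + \sigma \alpha \phi_k'(0)$ do
        $\alpha \leftarrow \alpha / 2$
    end do
    $\alpha_k \leftarrow \alpha$
    $(x_{k+1}, \lambda_{k+1}) \leftarrow (x_k, \lambda_k) + \alpha_k (p_k, \xi_k)$
    Compute $g_{k+1}$, $A_{k+1}$ and $c_{k+1}$
    Update $H_k$ to form $H_{k+1}$
    $k \leftarrow k + 1$
until convergence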
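
The halving loop at the end of the outline can be sketched as follows. This is a simplified illustration under stated assumptions: it starts from a trial steplength and only enforces the sufficient-decrease test and the bound on the constraint violation, and it omits the derivative condition used when selecting the initial trial value. The names `backtrack`, `within_bound` and `max_halvings` are hypothetical.

```python
def backtrack(phi, dphi0, within_bound, sigma=0.1, alpha=1.0, max_halvings=60):
    """Halve alpha until sufficient decrease holds and the constraint bound is respected.

    phi          : callable, merit function along the search direction, phi(alpha)
    dphi0        : phi'(0), assumed negative (descent direction)
    within_bound : callable, True when c(x_k + alpha p_k) >= -beta_c e
    """
    phi0 = phi(0.0)
    for _ in range(max_halvings):
        if within_bound(alpha) and phi(alpha) <= phi0 + sigma * alpha * dphi0:
            return alpha
        alpha *= 0.5
    raise RuntimeError("no acceptable steplength found")


# Hypothetical one-dimensional merit function with phi(0) = 1 and phi'(0) = -6.
phi = lambda a: (1.0 - 3.0 * a) ** 2
print(backtrack(phi, -6.0, lambda a: True))      # prints 0.5
print(backtrack(phi, -6.0, lambda a: a <= 0.3))  # prints 0.25
```

The second call mimics a case where the full and half steps would violate the bound on the constraints, so the step is reduced further even though sufficient decrease already holds.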


The  following  are  some  comments  on  the  steps  of  the  algorithm. 
• At each point $x_k$, we form the QP subproblem

$$ \text{minimize} \quad g_k^T p + \tfrac12 p^T H_k p \tag{2.4a} $$

$$ \text{subject to} \quad A_k p \ge -c_k, \tag{2.4b} $$

and determine a stationary point for QP, that is, a point $\bar{p}_k$ satisfying

$$ g_k + H_k \bar{p}_k = A_k^T \pi_k, \tag{2.5a} $$

$$ A_k \bar{p}_k + c_k \ge 0, \qquad \pi_k^T (A_k \bar{p}_k + c_k) = 0, \tag{2.5b} $$

for some vector $\pi_k \in \mathbb{R}^m$.

From information available at the stationary point we construct a search direction $p_k$
and $\mu_k$, an estimate of $\lambda^*$. The precise conditions that $p_k$ and $\mu_k$ need to satisfy are
given later in this section. If $p_k = 0$, we set $\lambda_k = \mu_k$ and terminate. Otherwise, we
compute the search direction in the space of the multiplier estimates as

$$ \xi_k = \mu_k - \lambda_k. \tag{2.6} $$


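• The slack variables $s_k$ are computed from

$$ (s_k)_j = \begin{cases} \max(0, (c_k)_j) & \text{if } \rho_{k-1} = 0, \\ \max\bigl(0, (c_k)_j - (\lambda_k)_j / \rho_{k-1}\bigr) & \text{otherwise.} \end{cases} \tag{2.7} $$

These values minimize the merit function (2.1) at $(x_k, \lambda_k, \rho_{k-1})$ with respect to the
slack variables.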

The slack variables $s_k$ appear in the merit function (2.1) as part of the term $c_k - s_k$.
From (2.7), this term takes the value

$$ (c_k - s_k)_j = \begin{cases} \min(0, (c_k)_j) & \text{if } \rho_{k-1} = 0, \\ \min\bigl((c_k)_j, (\lambda_k)_j / \rho_{k-1}\bigr) & \text{otherwise.} \end{cases} \tag{2.8} $$

We shall require the following inequality:

$$ \|c_k^-\| \le \|c_k - s_k\|. \tag{2.9} $$

To simplify the notation in the justification of this result, we drop the subscript $k$.

If $c_j - s_j = c_j$ then clearly $|c_j - s_j| = |c_j| \ge |c_j^-|$.

If $c_j - s_j \ne c_j$ and $c_j \ge 0$, then $c_j^- = 0 \le |c_j - s_j|$. Otherwise, $c_j - s_j \ne c_j$ and $c_j < 0$.
From (2.8) we get $c_j - s_j \le c_j < 0$, and hence $|c_j - s_j| \ge |c_j| \ge |c_j^-|$. We have shown
$|c_j^-| \le |c_j - s_j|$ under all circumstances, implying (2.9).
• The search direction in the space of the slack variables $q_k$ is set to the vector of slack
variables for the QP subproblem, i.e.

$$ q_k = A_k p_k + c_k - s_k. \tag{2.10} $$

For a linear constraint this choice keeps the corresponding slack variable at its optimum
value.


• The penalty parameter will not be modified if the condition

$$ \phi_k'(0) \le -\tfrac12 p_k^T H_k p_k, \tag{2.11} $$

is satisfied, where $\phi_k(\alpha)$ is defined in (2.2). Otherwise, we define the penalty parameter
as

$$ \rho_k = \max(2\rho_{k-1}, \hat{\rho}_k, \beta_\rho), \tag{2.12} $$

where $\beta_\rho$ is some positive constant,

$$ \hat{\rho}_k = \frac{\psi_k(p_k) + (2\lambda_k - \mu_k)^T (c_k - s_k)}{\|c_k - s_k\|^2}, \tag{2.13} $$

and $\psi_k$ was defined in (2.3). It will be shown that the definition (2.12) ensures that
$(p_k, \xi_k, q_k)$ is a sufficient descent direction for the merit function, in the sense that
condition (2.11) holds for this value of the penalty parameter.

• The steplength $\alpha_k > 0$ is computed to reduce $\phi_k(\alpha)$ while keeping the constraint
violation bounded. The termination conditions for the linesearch are as follows:

If

$$ \phi_k(1) - \phi_k(0) \le \sigma \phi_k'(0), \tag{2.14} $$

set $\alpha = 1$. Otherwise, find an $\alpha \in (0, 1)$ such that

$$ \phi_k(\alpha) - \phi_k(0) \le \sigma \alpha \phi_k'(0), \tag{2.15a} $$

$$ \phi_k'(\alpha) \ge \eta \phi_k'(0), \tag{2.15b} $$

where $0 < \sigma < \eta < 1$.

If the condition

$$ c(x_k + \alpha p_k) \ge -\beta_c e \tag{2.16} $$

holds, we define $\alpha_k = \alpha$; otherwise we compute $\alpha_k$ by performing a backtracking
linesearch from $\alpha$ until (2.15a) and (2.16) are both satisfied. It will be shown later
that such a steplength always exists, and that Algorithm ETSQP is well defined. This
definition of the steplength ensures that $c(x_k) \ge -\beta_c e$ for all $k$. A more sophisticated
algorithm could be used to determine $\alpha_k$ when (2.16) does not hold. However, we
anticipate such events will be rare.

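• Finally, $x_k$ and $\lambda_k$ are updated from

$$ \begin{pmatrix} x_{k+1} \\ \lambda_{k+1} \end{pmatrix} = \begin{pmatrix} x_k \\ \lambda_k \end{pmatrix} + \alpha_k \begin{pmatrix} p_k \\ \xi_k \end{pmatrix}. \tag{2.17} $$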
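
The slack-variable rule (2.7) and the inequality (2.9) from the comments above can be checked numerically. The sketch below is an illustration only; the helper names `slacks`, `neg_part` and `norm2` and the sample data are assumptions introduced here.

```python
import math


def slacks(c, lam, rho_prev):
    """Slack variables from (2.7) for constraint values c and multipliers lam."""
    if rho_prev == 0:
        return [max(0.0, cj) for cj in c]
    return [max(0.0, cj - lj / rho_prev) for cj, lj in zip(c, lam)]


def neg_part(v):
    """The vector v^- with elements -min(0, v_j)."""
    return [-min(0.0, vj) for vj in v]


def norm2(v):
    return math.sqrt(sum(vj * vj for vj in v))


# Sample data (assumed): one satisfied, one violated, one nearly active constraint.
c = [1.0, -0.5, 0.2]
lam = [0.4, 1.0, -0.6]
s = slacks(c, lam, rho_prev=2.0)
residual = [cj - sj for cj, sj in zip(c, s)]       # c - s, approximately [0.2, -0.5, -0.3]
print(s, residual)
print(norm2(neg_part(c)) <= norm2(residual))       # inequality (2.9); prints True
```

Each component of `residual` equals the smaller of $c_j$ and $\lambda_j / \rho$, which is the second case of (2.8).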
The definition of the search direction

At each iteration of ETSQP an inner iteration is performed to determine the search direction
by solving the QP subproblem (2.4) using an active-set method. The following is
an outline of a suitable algorithm to determine the search direction. The outer iteration
subscript has been omitted, and the subscript $i$ refers to the inner iterations.

We assume that positive constants $\beta_p$, $\beta_b$, $\gamma_M$ have been defined.


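Algorithm SD

Compute $p_0$ satisfying:
    $A p_0 + c \ge 0$, $\|p_0\| \le \beta_p \|c^-\|$, $g^T p_0 \le \beta_p \|c^-\|$
Form $\hat{A}_0$, the active-set matrix at $p_0$, as the set of all rows in $A$ corresponding to
active QP constraints at $p_0$
$i \leftarrow 0$
repeat
    Compute $\hat{p}_i$ from
        $\begin{pmatrix} H & \hat{A}_i^T \\ \hat{A}_i & 0 \end{pmatrix} \begin{pmatrix} \hat{p}_i \\ -\pi_i \end{pmatrix} = \begin{pmatrix} -g - H p_i \\ 0 \end{pmatrix}$
    $\gamma_i \leftarrow \min\bigl(1, \inf_j \{ -(c_j + a_j^T p_i) / (a_j^T \hat{p}_i) \mid a_j^T \hat{p}_i < 0 \}\bigr)$
    $p_{i+1} \leftarrow p_i + \gamma_i \hat{p}_i$
    Set $\hat{A}_{i+1}$ to be the active-set matrix at $p_{i+1}$
    $i \leftarrow i + 1$
until $(p_i, \pi_i)$ satisfy (2.5)
$\bar{p} \leftarrow p_i$
$\pi \leftarrow \pi_i$
if $\pi \ge 0$
    $p \leftarrow \bar{p}$
else
    Define $v$ to satisfy: $\|v\| = 1$, $v \ge 0$, $v_j \pi_j \le 0 \;\forall j$, $v^T \pi \le \beta_b \min_j \pi_j$
    Compute $\hat{d}$ by solving: $\min \{ d^T d \mid \hat{A}_i d = v \}$
    $d \leftarrow \hat{d} / \|\hat{d}\|$
    $\gamma \leftarrow \min\Bigl( -\dfrac{(g + H \bar{p})^T d}{d^T H d},\; \inf_j \Bigl\{ -\dfrac{c_j + a_j^T \bar{p}}{a_j^T d} \Bigm| a_j^T d < 0 \Bigr\},\; \gamma_M \Bigr)$
    if $\|\bar{p} + \gamma d\| \ge \|\bar{p}\|$
        $p \leftarrow \bar{p} + \gamma d$
    else
        $p \leftarrow \bar{p}$
    end if
end if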
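
The steplength $\gamma_i$ used inside the repeat loop of Algorithm SD is a ratio test: take a unit step unless an inactive constraint would be violated first. A minimal sketch, assuming dense row-wise storage of $A$ and the hypothetical helper name `step_to_nearest_constraint`, is:

```python
def step_to_nearest_constraint(A, c, p, p_hat):
    """Return min(1, step along p_hat to the nearest constraint a_j^T p + c_j >= 0)."""
    gamma = 1.0
    for a_j, c_j in zip(A, c):
        slope = sum(aji * pi for aji, pi in zip(a_j, p_hat))        # a_j^T p_hat
        if slope < 0.0:                                             # constraint value decreases
            slack = c_j + sum(aji * pi for aji, pi in zip(a_j, p))  # c_j + a_j^T p >= 0
            gamma = min(gamma, -slack / slope)
    return gamma


# Assumed data: constraints p1 + 1 >= 0 and p2 + 2 >= 0, current point p = (0, 0).
A = [[1.0, 0.0], [0.0, 1.0]]
c = [1.0, 2.0]
print(step_to_nearest_constraint(A, c, [0.0, 0.0], [-4.0, -1.0]))  # prints 0.25
print(step_to_nearest_constraint(A, c, [0.0, 0.0], [1.0, 1.0]))    # prints 1.0
```

Constraints already in the working set have $a_j^T \hat{p}_i = 0$ and are skipped automatically by the sign test.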
Some comments on this procedure are presented below.

• An initial feasible point $p_0$ of the QP subproblem is obtained.

When the minimizer of the QP is used as the search direction, then, given the uniqueness
of $p$, the choice of $p_0$ is irrelevant. If we determine the search direction from a
stationary point that is not a minimizer, the sequence of stationary points that we
compute depends directly on the value of $p_0$. We wish to define the initial point in
such a manner that all stationary points are satisfactory points at which to terminate
the solution process. It will be seen that the following conditions on $p_0$ are sufficient
to ensure our objective.

- For some constant $\beta_p > 0$,

$$ \|p_0\| \le \beta_p \|c^-\| \quad \text{and} \quad g^T p_0 \le \beta_p \|c^-\|. \tag{2.18} $$


• A sequence of feasible descent steps is taken, for example, by first computing the
step $\hat{p}_i$ to the minimizer of the QP on the current working set as the solution of the
system of equations

$$ \begin{pmatrix} H & \hat{A}_i^T \\ \hat{A}_i & 0 \end{pmatrix} \begin{pmatrix} \hat{p}_i \\ -\pi_i \end{pmatrix} = \begin{pmatrix} -g - H p_i \\ 0 \end{pmatrix}, \tag{2.19} $$

where $p_i$ is the current estimate. A step $\gamma_i \hat{p}_i$ is taken, where $\gamma_i$ is obtained as either one
or the step to the nearest constraint,

$$ \gamma_i = \min\Bigl(1, \inf_j \Bigl\{ -\frac{c_j + a_j^T p_i}{a_j^T \hat{p}_i} \Bigm| a_j^T \hat{p}_i < 0 \Bigr\}\Bigr). \tag{2.20} $$

The QP algorithm may be terminated at any stationary point $\bar{p}$. (Algorithm SD is
terminated at the first stationary point.) It will be seen in the proofs that to always
use $\bar{p}$ as the search direction will not in general ensure convergence.

• If $\bar{p}$ is the minimizer of the QP subproblem, that is, if $\pi \ge 0$, the search direction $p$ is
defined as $p = \bar{p}$, otherwise

$$ p = \begin{cases} \bar{p} + \gamma d & \text{if } \|\bar{p}\| \le \|\bar{p} + \gamma d\|, \\ \bar{p} & \text{otherwise,} \end{cases} \tag{2.21} $$

where the vector $d$ and the scalar $\gamma$ are computed with the following properties:

- $d$ is feasible with respect to the active QP constraints at $\bar{p}$, $\hat{A}_i d \ge 0$, and it has
unit norm, $\|d\| = 1$.

- The rate of descent along $d$ is sufficiently large. Specifically, we require

$$ (H \bar{p} + g)^T d \le \beta_b \min_j \pi_j \tag{2.22} $$

for some positive constant $\beta_b$.
There are many procedures for computing a suitable vector $d$; we now describe
one such procedure (see algorithm SD). It proceeds by defining an auxiliary
vector $v$ with the following properties

$$ \|v\| = 1, \quad v \ge 0, \quad v_j \pi_j \le 0 \;\; \forall j, \quad v^T \pi \le \beta_b \min_j \pi_j; $$

such a vector can be obtained for example by letting

$$ v_j = \begin{cases} 1 & \text{if } \pi_j < 0, \\ 0 & \text{otherwise.} \end{cases} $$

We then compute $\hat{d}$, the least-length solution of $\hat{A}_i y = v$, and set

$$ d = \hat{d} / \|\hat{d}\|. $$

- The scalar $\gamma$ is given by

$$ \gamma = \min(\hat{\gamma}, \bar{\gamma}, \gamma_M), $$

where $\gamma_M$ is a specified upper bound on the steplength,

$$ \bar{\gamma} = \inf_j \Bigl\{ -\frac{c_j + a_j^T \bar{p}}{a_j^T d} \Bigm| a_j^T d < 0 \Bigr\} $$

is the largest feasible step from $\bar{p}$ along $d$, and

$$ \hat{\gamma} = -\frac{(g + H \bar{p})^T d}{d^T H d} $$

is the step to the minimizer of $\psi(\bar{p} + \gamma d)$.


(2.23) 

(2.24) 


(2.25) 


(2.26) 
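
The construction of the auxiliary vector $v$, the steplength $\gamma$ and the final choice of search direction can be sketched for a small dense problem. This is an illustration under assumptions: the helper names are hypothetical, the indicator vector is rescaled to unit norm (an assumption added here so that $\|v\| = 1$ also holds when several multipliers are negative), and the direction $d$ is supplied by hand rather than computed from a least-length solve.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def auxiliary_vector(pi):
    """Indicator of the negative multipliers, scaled to unit Euclidean norm."""
    v_hat = [1.0 if pj < 0.0 else 0.0 for pj in pi]
    norm = math.sqrt(sum(v_hat))
    if norm == 0.0:
        raise ValueError("all multipliers are nonnegative; p_bar is the QP minimizer")
    return [vj / norm for vj in v_hat]


def steplength(g, H, A, c, p_bar, d, gamma_max):
    """gamma = min(step to minimizer along d, largest feasible step, gamma_max)."""
    grad = [gi + hi for gi, hi in zip(g, matvec(H, p_bar))]   # g + H p_bar
    gamma_hat = -dot(grad, d) / dot(d, matvec(H, d))          # minimizer of psi along d
    gamma_bar = math.inf                                      # largest feasible step
    for a_j, c_j in zip(A, c):
        slope = dot(a_j, d)
        if slope < 0.0:
            gamma_bar = min(gamma_bar, -(c_j + dot(a_j, p_bar)) / slope)
    return min(gamma_hat, gamma_bar, gamma_max)


def search_direction(p_bar, d, gamma):
    """Accept p_bar + gamma d only if it is at least as long as p_bar."""
    trial = [pj + gamma * dj for pj, dj in zip(p_bar, d)]
    return trial if dot(trial, trial) >= dot(p_bar, p_bar) else list(p_bar)


# Assumed QP: minimize -p1 - p2 + (p1^2 + p2^2)/2 subject to p1 >= 0.
g, H = [-1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]]
A, c = [[1.0, 0.0]], [0.0]
p_bar, pi = [0.0, 1.0], [-1.0]   # stationary on p1 = 0 with a negative multiplier
v = auxiliary_vector(pi)         # [1.0]
d = [1.0, 0.0]                   # least-length solution of [1 0] d = v, already unit length
gamma = steplength(g, H, A, c, p_bar, d, gamma_max=10.0)
p = search_direction(p_bar, d, gamma)
print(gamma, p)                  # prints 1.0 [1.0, 1.0]
```

In this example the corrected direction reaches the minimizer of the QP, although that is a property of the toy data and not something the construction guarantees in general.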


The  multiplier  estimates 

Equation (2.6) defining the search direction on the multiplier space requires the computation
of an estimate $\mu_k$ for the Lagrange multipliers. The estimates $\{\mu_k\}$ are then used to
update $\{\lambda_k\}$, the Lagrange multiplier estimate used in the merit function. To allow flexibility
in algorithm design we have chosen to specify conditions on the multiplier estimates
$\mu_k$ rather than give explicit definitions.

It will be shown that the following conditions on $\mu_k$ are sufficient to ensure that the
algorithm is globally convergent.

MC1. The estimates $\mu_k$ are uniformly bounded in norm, that is $\|\mu_k\| \le \beta_\mu < \infty$.

MC2. The complementarity condition $\mu_k^T (A_k p_k + c_k) = 0$ is satisfied at all iterations.

We may satisfy these conditions by choosing $\mu_k = 0$. Condition MC2 is made for
convenience; condition MC1 and the form in which the multiplier estimates are updated
imply that $\{\lambda_k\}$ are uniformly bounded.


Lemma 2.1. If condition MC1 holds, then $\|\lambda_k\| \le \beta_\mu$ for all $k$.
Proof. The proof is by induction. We select $\beta_\mu$ to satisfy $\|\lambda_0\| \le \beta_\mu$. From (2.17),

$$ \lambda_{k+1} = \lambda_k + \alpha_k (\mu_k - \lambda_k), \qquad k \ge 0. \tag{2.27} $$

Using norm inequalities and $0 \le \alpha_k \le 1$, we have

$$ \|\lambda_{k+1}\| \le \alpha_k \|\mu_k\| + (1 - \alpha_k) \|\lambda_k\| \le \alpha_k \beta_\mu + (1 - \alpha_k) \beta_\mu = \beta_\mu, $$

as required. ∎
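
The argument in the proof is that each update is a convex combination of two vectors whose norms are bounded by $\beta_\mu$. A small numerical illustration follows; the data and the helper names are assumptions made for this sketch.

```python
import math


def update_multipliers(lam, mu, alpha):
    """lambda_{k+1} = lambda_k + alpha_k (mu_k - lambda_k), as in (2.27)."""
    return [lj + alpha * (mj - lj) for lj, mj in zip(lam, mu)]


def norm2(v):
    return math.sqrt(sum(vj * vj for vj in v))


beta_mu = 5.0
lam = [3.0, -4.0]                                    # ||lambda_0|| = 5 <= beta_mu
estimates = [[5.0, 0.0], [0.0, -5.0], [-3.0, 4.0]]   # each estimate has norm 5 <= beta_mu
steps = [1.0, 0.5, 0.25]                             # steplengths in [0, 1]
norms = []
for mu, alpha in zip(estimates, steps):
    lam = update_multipliers(lam, mu, alpha)
    norms.append(norm2(lam))
print(all(nk <= beta_mu + 1e-12 for nk in norms))    # prints True
```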

Second-order  information 

We choose the matrices $\{H_k\}$ to be positive definite and bounded, with bounded condition
number. In practice, such matrices may be generated (see [GMSW86a]) by updating a
quasi-Newton approximation to the Hessian of the Lagrangian function or the Hessian of
the augmented Lagrangian function in each iteration together with certain safeguards (for
example, if the factors of $H_k$ are updated, by enforcing bounds on the size of the elements,
and ensuring sufficiently positive diagonal elements). These conditions can be written as
follows:

HC1. $\beta_{lvH} < \infty$ is the largest eigenvalue of $\{H_k\}$.

HC2. $\beta_{svH} > 0$ is the smallest eigenvalue of $\{H_k\}$.

3.  Global  convergence  results 

The  results  in  this  section  establish  global  convergence  properties  for  algorithm  ETSQP. 
We  first  introduce  the  assumptions  under  which  we  shall  show  convergence,  and  then  we 
prove  the  following  results: 

• The iterates $\{x_k\}$ lie on a compact set.

- In Lemma 3.1 we show that the quantities associated with the algorithm are well
defined at all points.

- In Lemma 3.2 it is shown that if $\|x_k\|$ is large then $\|p_k\|$ cannot be arbitrarily
small.

- In Lemma 3.3 we show that $p$ computed using algorithm SD satisfies

$$ \psi(p) = g^T p + \tfrac12 p^T H p \le -\beta_1 p^T H p + \beta_2 \|c - s\|, $$

where $\beta_1$ and $\beta_2$ are positive constants.

- Lemma 3.4 proves that the sequence $\{x_k\}$ lies on a compact set.

- Lemma 3.5 shows that the sequence $\{p_k\}$ also remains bounded.

• The sequence $\{\|p_k\|\}$ dominates the sequence $\{\|x_k - x_k^*\|\}$, where $x_k^*$ denotes a KKT
point closest to $x_k$. The main implication of this result is that $\|p_k\| \to 0$ is sufficient
to ensure that $x_k \to x^*$, a KKT point of NP.
- It is shown in Lemma 3.6 that, under the assumptions we make, the KKT points
for problem NP are isolated.

- Lemma 3.7 introduces another preliminary result, proving that if $p_k \to 0$ along
a subsequence then along this subsequence $\|x_k - x_k^*\| \to 0$, where $x_k^*$ is a KKT
point for NP nearest to $x_k$. Moreover, for large enough $k$, $p_k$ is the minimizer of
the QP subproblem, and the correct active set at $x_k^*$ is identified.

- The proof that $\|p_k\|$ dominates $\|x_k - x_k^*\|$ is given in Lemma 3.8.

• Bounds on the growth of the penalty parameter $\rho_k$.

We cannot prove that $\rho_k$ will remain bounded in the algorithm without stronger conditions
on the multiplier estimate $\mu_k$, but we can show that its growth is bounded by
certain quantities related to the algorithm, and that is enough to prove convergence.

- We show in Lemma 3.9 that at all the iterations where the penalty parameter is
modified the following bounds hold,

$$ \rho_k \|c_k - s_k\| \le N \quad \text{and} \quad \rho_k \|p_k\|^2 \le N. $$

- In Lemma 3.10 and Lemma 3.11 we show that similar inequalities hold at all
iterations.

• The steplength $\alpha_k$ is bounded away from zero if we are not close to a solution.

- We first need a bound on the second derivatives of $\phi(\alpha)$. In Lemma 3.12 we
prove that $\phi_k''(\alpha_k) \le N$ for some positive constant $N$.

- In Lemma 3.13 we show that, if $\|p_k\|$ is large enough, there exists a value $\bar{\alpha} > 0$
independent of the iteration such that $\alpha_k \ge \bar{\alpha}$.

• In Theorem 3.1 we show that $x_k \to x^*$.

• Finally, we prove that $\lambda_k \to \lambda^*$.

- This result requires stronger conditions on the multiplier estimate $\mu_k$ than just
MC1 and MC2. We start by introducing a third condition MC3.

- Lemma 3.14 strengthens the result in Lemma 3.13 showing that, under the new
conditions on the multipliers, $\alpha_k$ is uniformly bounded away from zero.

- In Theorem 3.2 we show that $\lambda_k \to \lambda^*$.

Assumptions 

Some of the following assumptions make use of the concepts of stationary points and KKT
points at infinity. We will say that NP has a stationary point at infinity if there exist
sequences $\{x_k\}$ and $\{\eta_k\}$ such that $\|x_k\| \to \infty$ and/or $\|\eta_k\| \to \infty$, and

$$ c_k^- \to 0, \qquad g_k - A_k^T \eta_k \to 0, \qquad \eta_k^T c_k \to 0. $$

If in addition to these conditions we also have $\eta_k^- \to 0$, then we have a KKT point at
infinity.

We  make  the  following  assumptions: 
A1. For some constant $\beta_c > 0$, the global minimum of the problem

$$ \min_{x \in \mathbb{R}^n} \; F(x) \quad \text{s.t.} \quad c(x) \ge -\beta_c e, $$

is bounded below.

A2.  There  exist  no  KKT  points  at  infinity  for  problem  NP. 

A3.  F,  cj  and  their  first  and  second  derivatives  are  continuous  and  uniformly  bounded  in 
norm  on  a  compact  set. 

A4. The Jacobian corresponding to the active constraints at all KKT points has full rank.

A5. A feasible point $p_{k0}$ exists to all the QP subproblems, satisfying

$$ \|p_{k0}\| \le \beta_p \|c_k^-\| \quad \text{and} \quad g_k^T p_{k0} \le \beta_p \|c_k^-\| $$

for some constant $\beta_p > 0$.

A6.  Strict  complementarity  holds  at  all  stationary  points  of  NP,  including  stationary  points 
at  infinity,  if  they  exist. 

A7.  The  reduced  Hessian  of  the  Lagrangian  function  is  nonsingular  at  all  KKT  points. 

The larger the value of $\beta_c$, the stronger is assumption A1. There will be problems, for
example $F(x) = f(x)^T f(x)$, where it is known a priori that assumption A1 holds with
$\beta_c = \infty$. Also, if A1 does not hold with $\beta_c = 0$ then it is possible for any reasonable
algorithm to diverge.

Assumption  A5  imposes  conditions  on  the  initial  point  for  the  QP.  It  is  possible  that 
no  point  satisfies  these  conditions;  this  would  be  the  case  for  example  if  one  of  the  QP 
subproblems  generated  by  the  algorithm  is  not  feasible.  Nevertheless,  by  introducing  an 
additional  variable  it  is  possible  to  construct  a  modified  problem  for  which  this  condition 
is  satisfied  trivially.  Consider  the  problem 

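$$ \min_{(x, \tilde{x}) \in \mathbb{R}^{n+1}} \; \tilde{F}(x, \tilde{x}) = (1 - \omega) F(x) + \omega \tilde{x} \quad \text{s.t.} \quad c(x) + \tilde{x} e \ge 0 \ \text{ and } \ \tilde{x} \ge 0, $$

where $\tilde{x} \in \mathbb{R}$ and $\omega \in [0, 1]$. The KKT points for this problem are also KKT points for NP
if NP is feasible and $\omega$ is sufficiently close to one. The modified problem is always feasible,
and the corresponding QP subproblem takes the form

$$ \min_{(p, \tilde{p}) \in \mathbb{R}^{n+1}} \; (1 - \omega) g_k^T p + \omega \tilde{p} + \tfrac12 \begin{pmatrix} p^T & \tilde{p} \end{pmatrix} H_k \begin{pmatrix} p \\ \tilde{p} \end{pmatrix} $$

$$ \text{s.t.} \quad c_k + A_k p + \tilde{x}_k e + \tilde{p} e \ge 0, \qquad \tilde{x}_k + \tilde{p} \ge 0. $$

For this QP subproblem the point

$$ p_0 = \begin{pmatrix} p \\ \tilde{p} \end{pmatrix} = \begin{pmatrix} 0 \\ \|(c_k + \tilde{x}_k e)^-\|_\infty \end{pmatrix} $$

is feasible since we can ensure that $\tilde{x}_k \ge 0$. Therefore there always exists a feasible point
that satisfies A5 with $\beta_p = 1$ since $\|p_0\| = \|(c_k + \tilde{x}_k e)^-\|_\infty$ and

$$ \nabla \tilde{F}_k^T p_0 = \begin{pmatrix} (1 - \omega) g_k^T & \omega \end{pmatrix} \begin{pmatrix} 0 \\ \|(c_k + \tilde{x}_k e)^-\|_\infty \end{pmatrix} = \omega \|(c_k + \tilde{x}_k e)^-\|_\infty \le \|(c_k + \tilde{x}_k e)^-\|_\infty, $$

implying that Assumption A5 is redundant for the modified problem.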
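
The feasibility of this initial point is easy to confirm numerically. The sketch below is an illustration only: the names `modified_qp_start`, `is_feasible` and `x_extra` (standing for the current value of the additional variable) and the sample data are assumptions.

```python
def modified_qp_start(c_k, x_extra, n):
    """Return (p, p_extra) = (0, ||(c_k + x_extra e)^-||_inf) for the modified QP."""
    shifted = [cj + x_extra for cj in c_k]
    p_extra = max([0.0] + [-min(0.0, vj) for vj in shifted])
    return [0.0] * n, p_extra


def is_feasible(c_k, x_extra, p_extra, tol=1e-12):
    """Check both constraint blocks of the modified QP at p = 0."""
    shifted_ok = all(cj + x_extra + p_extra >= -tol for cj in c_k)
    return shifted_ok and x_extra + p_extra >= -tol


# Assumed data: three constraints, two of them violated, extra variable at 0.5.
c_k, x_extra = [-2.0, 0.5, -0.3], 0.5
p, p_extra = modified_qp_start(c_k, x_extra, n=2)
print(p, p_extra, is_feasible(c_k, x_extra, p_extra))  # prints [0.0, 0.0] 1.5 True
```

Because $p = 0$, the term $A_k p$ vanishes and only the shift by the additional variable and its step matters for feasibility.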

Existence  of  the  iterates 

We  start  by  showing  that  all  the  quantities  associated  with  the  algorithm  are  well  defined. 
In  particular,  we  show  that  the  choice  of  penalty  parameter  ensures  (2.11)  is  satisfied  and 
that  the  steplength  exists. 

Lemma 3.1. Under assumptions A3, A5 and conditions HC1, HC2, the procedures given
in the algorithm to compute the values of the penalty parameter $\rho_k$ and the steplength $\alpha_k$
are well defined.

Proof. We drop the subscript $k$ denoting the iteration number, to simplify the notation.

Consider the gradient of the merit function $L_A$, defined in (2.1), with respect to $x$, $\lambda$
and $s$,

$$\nabla L_A(x,\lambda,s) = \begin{pmatrix} g(x) - A(x)^T\lambda + \rho A(x)^T(c(x) - s) \\ -(c(x) - s) \\ \lambda - \rho(c(x) - s) \end{pmatrix}. \tag{3.1}$$

It follows from (2.6), (2.10) and (2.2) that $\phi'(0)$ is given by

$$\begin{aligned}
\phi'(0) &= p^T g - p^T A^T\lambda + \rho\, p^T A^T(c - s) - (c - s)^T\xi + \lambda^T q - \rho\, q^T(c - s) \\
&= p^T g + (2\lambda - \mu)^T(c - s) - \rho\|c - s\|^2,
\end{aligned} \tag{3.2}$$

where $g$, $A$, and $c$ are evaluated at $x$.

If $\|c - s\| = 0$, from (2.9) and (2.18) we have $p_0 = 0$, and since $\psi(p) = p^T g + \tfrac12 p^T H p \le \psi(p_0) = 0$ it follows that

$$\phi'(0) = p^T g \le -\tfrac12 p^T H p,$$

implying that $\rho$ does not need to be modified.

If $\|c - s\| > 0$, we obtain from (3.2) that for $\rho = \hat\rho$ (defined in (2.13))

$$\phi'(0) = g^T p + (2\lambda - \mu)^T(c - s) - \hat\rho\|c - s\|^2 = -\tfrac12 p^T H p,$$

which implies the desired descent condition (2.11) is satisfied for all $\rho \ge \hat\rho$.
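
The role of the threshold value of the penalty parameter can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: equation (2.13) is not reproduced in this section, so the function `rho_hat` below simply solves (3.2) for the value of the penalty parameter at which the directional derivative equals -0.5 * p'Hp, which is the property used in the argument above. All function names and the sample data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def phi_prime_0(g, p, lam, mu, c, s, rho):
    """Directional derivative (3.2): p'g + (2*lam - mu)'(c - s) - rho*||c - s||^2."""
    r = [ci - si for ci, si in zip(c, s)]
    w = [2.0 * l - m for l, m in zip(lam, mu)]
    return dot(p, g) + dot(w, r) - rho * dot(r, r)


def rho_hat(g, H, p, lam, mu, c, s):
    """Penalty value for which phi'(0) = -0.5 * p'Hp; requires ||c - s|| > 0."""
    r = [ci - si for ci, si in zip(c, s)]
    w = [2.0 * l - m for l, m in zip(lam, mu)]
    return (dot(p, g) + 0.5 * dot(p, matvec(H, p)) + dot(w, r)) / dot(r, r)


# Illustrative data (hypothetical): two variables, one constraint.
g = [1.0, -1.0]
H = [[2.0, 0.0], [0.0, 1.0]]
p = [0.5, -0.2]
lam, mu = [0.3], [0.1]
c, s = [-0.4], [0.0]

threshold = rho_hat(g, H, p, lam, mu, c, s)
target = -0.5 * dot(p, matvec(H, p))
at_threshold = phi_prime_0(g, p, lam, mu, c, s, threshold)
above_threshold = phi_prime_0(g, p, lam, mu, c, s, 2.0 * threshold)
print(threshold, at_threshold, above_threshold, target)
```

Since the directional derivative is a decreasing affine function of the penalty parameter whenever c - s is nonzero, any value above the threshold also satisfies the descent condition.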

An immediate consequence of (2.11) and the properties of $H_k$ is the following bound on
the directional derivative:

$$\phi_k'(0) \le -\tfrac12\beta_{svH}\|p_k\|^2. \tag{3.3}$$

It follows from the procedure to increase the value of the penalty parameter (see (2.12))
that $\rho_k \to \infty$ if and only if the parameter is increased an infinite number of times.

We also need to prove that the value of $\alpha_k$ introduced in the algorithm is well defined.
We show that if condition (2.14) is not satisfied, a steplength $\hat\alpha \in (0,1)$ that satisfies
conditions (2.15) always exists (see, for example, Moré and Sorensen [MS84]).

Define the functions

$$\chi(\alpha) \equiv \phi(\alpha) - \phi(0) - \sigma\alpha\phi'(0), \qquad \zeta(\alpha) \equiv \phi'(\alpha) - \eta\phi'(0),$$

and note that from $\sigma < \eta$ and $\phi'(0) < 0$, implied by (2.11), we have

$$\chi'(\alpha) = \phi'(\alpha) - \sigma\phi'(0) < \phi'(\alpha) - \eta\phi'(0) = \zeta(\alpha), \tag{3.4}$$

for any $\alpha$.

If (2.14) does not hold,

$$\phi(1) - \phi(0) > \sigma\phi'(0) \ \Rightarrow\ \chi(1) > 0,$$

and we also have $\chi(0) = 0$. From these two results and the mean-value theorem, there will
be a point $\tilde\alpha \in [0,1]$ such that $\chi'(\tilde\alpha) > 0$, and from (3.4), $\zeta(\tilde\alpha) > 0$.

From $\phi'(0) < 0$ we have $\zeta(0) < 0$, and the continuity of $\zeta$ (assumption A3) will imply
the existence of a zero of $\zeta$ in $(0,\tilde\alpha)$. Let $\hat\alpha$ denote the smallest point in $(0,\tilde\alpha)$ such that
$\zeta(\hat\alpha) = 0$, that is,

$$\phi'(\hat\alpha) = \eta\phi'(0), \tag{3.5}$$

and condition (2.15b) is satisfied at $\hat\alpha$.

From $\zeta(0) < 0$ we must have

$$\zeta(\alpha) < 0 \ \ \forall\alpha \in [0,\hat\alpha) \ \Rightarrow\ \phi'(\alpha) < \eta\phi'(0) \ \ \forall\alpha \in [0,\hat\alpha), \tag{3.6}$$

implying that condition (2.15b) is not satisfied for any point in $[0,\hat\alpha)$.

Finally, from (3.4) and (3.6), we have

$$\chi'(\alpha) < 0 \ \ \forall\alpha \in [0,\hat\alpha),$$

and this together with $\chi(0) = 0$ implies $\chi(\hat\alpha) < 0$, that is,

$$\phi(\hat\alpha) - \phi(0) < \sigma\hat\alpha\phi'(0), \tag{3.7}$$

showing that $\hat\alpha$ satisfies both conditions (2.15) simultaneously.

We still need to consider condition (2.16). For the function $h(\alpha) = c(x + \alpha p) + \beta_c e$ we
have from (2.4b)

$$h'(0) = Ap \ge -c.$$

If $-\tfrac12\beta_c \ge c_j > -\beta_c$, we have $h_j(0) > 0$ and $h_j'(0) \ge \tfrac12\beta_c > 0$; if $c_j > -\tfrac12\beta_c$ then $h_j(0) > \tfrac12\beta_c > 0$;
and in any case there exists a value $\bar\alpha > 0$ such that $h_j(\alpha) > 0$ (implying $c_j(x + \alpha p) > -\beta_c$)
for all $j$ and all $\alpha \in [0,\bar\alpha]$, implying that for $\alpha \in [0,\min(\hat\alpha,\bar\alpha)]$ both conditions (2.15a)
and (2.16) hold simultaneously. ∎
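
The existence argument for the steplength can be traced on a one-dimensional example. The sketch below is hedged in several ways: conditions (2.14) and (2.15) are not reproduced in this section, so it assumes that (2.15a) is the sufficient-decrease test phi(a) - phi(0) <= sigma * a * phi'(0) and that (2.15b) holds once phi'(a) >= eta * phi'(0); bisection is used as a stand-in for whatever linesearch the algorithm employs, and it returns a zero of the auxiliary function zeta, which is the smallest zero only because zeta is increasing in this example. The model function, parameter values and names are hypothetical.

```python
def zero_of_zeta(zeta, hi, iters=60):
    """Bisect on [0, hi], assuming zeta(0) < 0 < zeta(hi); returns a point with zeta >= 0."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if zeta(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi


sigma, eta = 0.1, 0.9          # illustrative linesearch parameters with sigma < eta


def phi(a):
    return (a - 0.3) ** 2      # model merit function along the search direction


def dphi(a):
    return 2.0 * (a - 0.3)     # its derivative; dphi(0) = -0.6 < 0


def zeta(a):
    return dphi(a) - eta * dphi(0.0)


# The unit step fails the sufficient-decrease test, so a shorter step is needed.
unit_step_rejected = phi(1.0) - phi(0.0) > sigma * dphi(0.0)

alpha_hat = zero_of_zeta(zeta, hi=1.0)
decrease_ok = phi(alpha_hat) - phi(0.0) <= sigma * alpha_hat * dphi(0.0)
curvature_ok = zeta(alpha_hat) >= 0.0
print(alpha_hat, unit_step_rejected, decrease_ok, curvature_ok)
```

For this quadratic model the zero of zeta is 0.3 * (1 - eta), so the computed steplength can be compared with the closed-form value.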

This lemma implies that all the quantities associated with the algorithm are well defined.

Boundedness of the iterates

To prove global convergence we show first that if assumptions A1 and A2 hold, all points
in the sequence $\{x_k\}$ generated by the algorithm lie on a compact set. We start by showing
that for $\|x_k\|$ large enough we cannot have $\|p_k\|$ arbitrarily small.

Lemma 3.2. Under assumptions A2 and A6 and condition HC1, there exist positive
constants $M$ and $\epsilon$ such that $\|x_k\| > M \Rightarrow \|p_k\| > \epsilon$.

Proof. Assume this result does not hold. Then, for any positive constants $M$ and $\epsilon$ we
can find iterates such that $\|x_k\| > M$ and $\|p_k\| < \epsilon$, and we could construct a sequence
$\{x_k\}$, and its associated sequence $\{p_k\}$, along which $\|x_k\| \to \infty$ and $\|p_k\| \to 0$. For this
sequence, from $\|p_k\| \to 0$ and (2.4b), we must have $\|c_k^-\| \to 0$. Also, from the definition of
$p_k$, (2.21), it must hold that $\|\bar p_k\| \to 0$, and from (2.5a) and HC1, we must have

$$\|A_k^T\pi_k - g_k\| = \|H_k\bar p_k\| \to 0.$$

Since $\|p_k\| \to 0$ and $\|\bar p_k\| \to 0$, using (2.21) and $\|d_k\| = 1$, we also have either $\gamma_k \to 0$
or $\gamma_k = 0$ for $k$ large enough. It then follows from (2.24) that either $\min(\hat\gamma_k,\bar\gamma_k) \to 0$ or
$\hat\gamma_k = \bar\gamma_k = 0$ for $k$ large enough. If $\hat\gamma_k \to 0$ along a subsequence, then (2.25) implies for some
constraint $j$ that $(\pi_k)_j \to 0$ and $c_j(x_k) \to 0$, but this would contradict assumption A6. If
$\bar\gamma_k \to 0$ along a subsequence, then from (2.26) and (2.22) we get $\|\pi_k^-\| \to 0$.

The properties of this sequence,

$$\|c_k^-\| \to 0, \qquad \|A_k^T\pi_k - g_k\| \to 0,$$

and either $\|\pi_k^-\| \to 0$ or $\|\pi_k^-\| = 0$ for $k$ large enough imply there exists a KKT point at
infinity, which violates assumption A2, so the lemma must hold. ∎

Another  result  we  need  for  the  compactness  proof  is  a  bound  on  the  value  of  the  QP 
objective  function  at  the  incomplete  solution  for  the  QP. 

Lemma 3.3. Under assumption A5 and conditions HC1, HC2, for $p$ computed by
algorithm SD there exist constants $\beta_1 > 0$ and $\beta_2 > 0$ such that

$$\psi(p) = g^T p + \tfrac12 p^T H p \le -\beta_1 p^T H p + \beta_2\|c - s\|.$$

Proof. The result will be shown by considering first the initial point for the QP, $p_0$, and
then the descent achieved in each QP iteration.

By definition

$$\psi(p_0) = -\tfrac12 p_0^T H p_0 + g^T p_0 + p_0^T H p_0.$$

Since $\|p_0\| \le \beta_p\|c^-\|$ and $g^T p_0 \le \beta_p\|c^-\|$ (assumption A5), condition HC1 on $H$ implies

$$\psi(p_0) \le -\tfrac12 p_0^T H p_0 + \beta_p\|c^-\| + \beta_{lvH}\beta_p^2\|c^-\|^2. \tag{3.8}$$

Consider the quadratic function $b\gamma + \tfrac12 c\gamma^2$, where $b < 0$ and $c > 0$; then for all
$\gamma \in [0, -b/c]$ (between 0 and the minimizer), we have

$$\gamma \le -\frac{b}{c} \ \Rightarrow\ \gamma(b + c\gamma) \le 0 \ \Rightarrow\ b\gamma + \tfrac12 c\gamma^2 \le -\tfrac12 c\gamma^2. \tag{3.9}$$

The change in the QP objective function at any intermediate QP iteration $i$ can be
represented as

$$\psi(p_{i+1}) - \psi(p_i) = \tfrac12\gamma_i^2 d_i^T H d_i + \gamma_i(g + Hp_i)^T d_i, \tag{3.10}$$

where $d_i$ is used to denote the QP step obtained from (2.19) or the final step $d$ defined in
(2.23), and $\gamma_i$ is a feasible steplength bounded by the steplength to the minimizer along $d_i$, as
defined in (2.20) or (2.24). We have $d_i^T H d_i > 0$ (from condition HC2) and $(g + Hp_i)^T d_i < 0$
(from condition (2.22) and $\min_j \pi_j < 0$), implying that we can apply the bound (3.9) to
(3.10) to obtain

$$\psi(p_{i+1}) - \psi(p_i) \le -\tfrac12\gamma_i^2 d_i^T H d_i. \tag{3.11}$$

If we have taken $N$ iterations to compute $p$ (the search direction), by adding the
inequalities (3.11) for $i = 0,\ldots,N$ and using (3.8) we obtain

$$\begin{aligned}
\psi(p) &= \psi(p_0) + \sum_{i=1}^{N}\bigl(\psi(p_i) - \psi(p_{i-1})\bigr) \\
&\le -\tfrac12\Bigl(p_0^T H p_0 + \sum_{i=1}^{N}\gamma_i^2 d_i^T H d_i\Bigr) + \beta_p\|c^-\| + \beta_{lvH}\beta_p^2\|c^-\|^2.
\end{aligned} \tag{3.12}$$

We can use the convexity of the function $p^T H p$ (implied by property HC2) to write

$$p_0^T H p_0 + \sum_{i=1}^{N}\gamma_i^2 d_i^T H d_i \ge \frac{1}{N+1}\Bigl(p_0 + \sum_{i=1}^{N}\gamma_i d_i\Bigr)^T H \Bigl(p_0 + \sum_{i=1}^{N}\gamma_i d_i\Bigr) = \frac{1}{N+1}\, p^T H p.$$

This result together with (3.12) implies

$$\psi(p) \le -\frac{1}{2(N+1)}\, p^T H p + \beta_p\|c^-\| + \beta_{lvH}\beta_p^2\|c^-\|^2. \tag{3.13}$$

Since $c \ge -\beta_c e$ the desired result follows from this inequality and (2.9). ∎
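
The per-iteration decrease (3.11) can be observed numerically on a small unconstrained quadratic. The following Python sketch makes simplifying assumptions that are not part of the algorithm: the steps (2.19) and (2.23) are replaced by plain steepest-descent directions, there are no constraints limiting the step, and each steplength is a fixed fraction of the step to the minimizer along the direction. Function names and data are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def psi(g, H, p):
    """QP objective psi(p) = g'p + 0.5 * p'Hp."""
    return dot(g, p) + 0.5 * dot(p, matvec(H, p))


def damped_descent(g, H, p, fraction, n_steps):
    """Steps along d = -(g + Hp) with gamma = fraction * (step to the minimizer along d).

    Returns the final point and a list of pairs
    (actual change in psi, bound -0.5 * gamma^2 * d'Hd from (3.11)).
    """
    history = []
    for _ in range(n_steps):
        grad = [gi + hi for gi, hi in zip(g, matvec(H, p))]
        d = [-x for x in grad]
        b = dot(grad, d)              # slope along d, non-positive
        c = dot(d, matvec(H, d))      # curvature along d
        if c == 0.0:
            break                     # gradient vanished: nothing left to do
        gamma = fraction * (-b / c)   # lies in [0, -b/c] when 0 <= fraction <= 1
        p_next = [pi + gamma * di for pi, di in zip(p, d)]
        history.append((psi(g, H, p_next) - psi(g, H, p), -0.5 * gamma ** 2 * c))
        p = p_next
    return p, history


# Illustrative data (hypothetical): H is symmetric positive definite.
g = [1.0, -1.0]
H = [[2.0, 0.5], [0.5, 1.0]]
p_final, history = damped_descent(g, H, [0.4, 0.3], fraction=0.7, n_steps=5)
for change, bound in history:
    print(f"change = {change:.6f}, bound = {bound:.6f}")
```

With `fraction` equal to one the inequality holds with equality, which is the situation in which (3.9) is tight.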

We  can  now  prove  the  main  result  of  this  section. 

Lemma 3.4. Under assumptions A1, A2, A3, A5 and A6, and conditions MC1, HC1
and HC2, the sequence $\{x_k\}$ generated by the algorithm lies on a compact set.

Proof. First we show the set of points at which the penalty parameter is modified lies on
a compact set. If $\rho_k$ remains bounded it follows from the manner in which the penalty parameter is
modified, (2.12), that there is only a finite set of such points. Therefore we need only study
the case when $\rho_k \to \infty$. Consider the iterations $k$ where the penalty parameter is modified.
From condition MC1 and the boundedness of the multiplier estimates $\lambda_k$ (Lemma 2.1), we
have

$$\|2\lambda_k - \mu_k\| \le 2\|\lambda_k\| + \|\mu_k\| \le 3\beta_\mu. \tag{3.14}$$

This result, together with the definition of the penalty parameter (2.13), and Lemma 3.3
gives

$$\begin{aligned}
\hat\rho_k\|c_k - s_k\|^2 &\le g_k^T p_k + \tfrac12 p_k^T H_k p_k + (2\lambda_k - \mu_k)^T(c_k - s_k) \\
&\le (\beta_2 + 3\beta_\mu)\|c_k - s_k\| - \beta_1 p_k^T H_k p_k.
\end{aligned} \tag{3.15}$$

As we have assumed $\rho_k \to \infty$, (3.15) implies $\|c_k - s_k\| \to 0$, and from (2.9) also $\|c_k^-\| \to 0$.
From Lemma 3.3 and (3.14) we have

$$\omega_k \equiv g_k^T p_k + (2\lambda_k - \mu_k)^T(c_k - s_k) \tag{3.16a}$$

$$\le -\tfrac12 p_k^T H_k p_k - \beta_1 p_k^T H_k p_k + (\beta_2 + 3\beta_\mu)\|c_k - s_k\|. \tag{3.16b}$$

If $\|p_k\| \ge \epsilon > 0$ along an infinite subsequence, then it follows from $\|c_k - s_k\| \to 0$ and
HC2 that there exists an index $K$ such that for all $k \ge K$ in the subsequence,

$$(\beta_2 + 3\beta_\mu)\|c_k - s_k\| \le \beta_1 p_k^T H_k p_k.$$

From (3.16b) we obtain the following bound on $\omega_k$,

$$\omega_k \le -\tfrac12 p_k^T H_k p_k, \tag{3.17}$$

for $k \ge K$. From (3.16a) and the bounds (3.17) and (3.2), we have for sufficiently large $k$

$$\phi_k'(0) = \omega_k - \rho_k\|c_k - s_k\|^2 \le \omega_k \le -\tfrac12 p_k^T H_k p_k.$$

This last inequality implies that $\rho_k$ is not modified for all $k \ge K$, which contradicts our
assumption that the penalty parameter was modified an infinite number of times.

We have shown that $\|p_k\| \to 0$ along the subsequence at which the penalty parameter
is modified. The boundedness of $\|x_k\|$ along this subsequence follows from Lemma 3.2.

We now consider those points corresponding to iterations where the penalty parameter
is not modified. From condition (2.16) on the linesearch and assumption A1, we have
$F(x_k) \ge \beta_F > -\infty$ for all $k$. Also, from Lemma 2.1 $\|\lambda_k\|$ is bounded, implying that

$$L_A(x_k,\lambda_k,s_k,\rho_k) > -\infty. \tag{3.18}$$

Since $\|x_k\|$ is bounded when $\rho_k \ne \rho_{k-1}$ and $L_A(x_k,\lambda_k,s_k,\rho_k)$ is reduced when $\rho_k = \rho_{k-1}$ it
follows that $L_A(x_k,\lambda_k,s_k,\rho_k)$ is bounded. Moreover, the reduction in $L_A(x_k,\lambda_k,s_k,\rho_k)$ is
bounded for a sequence of iterations for which $\rho_k$ is not changed. Let $I$ denote the index
at which $\rho_k$ is modified and let $I < k \le K$ denote the iterates for which $\rho_k$ remains fixed.
It follows from the above reasoning that there exists $N$ such that

$$\phi_I - \phi_K = \sum_{k=I}^{K}(\phi_k - \phi_{k+1}) < N, \tag{3.19}$$

where to simplify the notation we have used $\phi_k = \phi_k(0)$.

From the termination condition for the linesearch (2.15a), (3.3) and (3.19), we also have

$$\tfrac12\sigma\beta_{svH}\sum_{k=I}^{K}\alpha_k\|p_k\|^2 \le \sum_{k=I}^{K}(\phi_k - \phi_{k+1}) < N. \tag{3.20}$$

This result implies that $\alpha_k\|p_k\|$ is bounded. Hence if $\|x_k\|$ is not bounded there must exist
sets of iterates with indices, say $s_l + 1 \le k \le r_l$ for $l = 1,2,\ldots$, such that $\|x_{s_l}\| \le M$,
$\|x_k\| > M$ for $M$ large enough, $\lim_{l\to\infty} r_l = \infty$, and $\lim_{l\to\infty}\|x_{r_l}\| = \infty$. It follows that if
$M$ is chosen so that $M > \max\{\|x_I\|\}$ then $\rho_k$ is constant in the interval $s_l \le k \le r_l$. The
existence of an index such that $\|x_{s_l}\| \le M$ is assured since we have $\|x_I\| \le M$ and at least
one index in the interval for which $\|x_k\| > M$. From these assumptions and definitions it
follows that

$$\sum_{k=s_l}^{r_l-1}\alpha_k\|p_k\| \ge \|x_{r_l} - x_{s_l}\| \to \infty. \tag{3.21}$$

It follows from Lemma 3.2 that $\|p_k\| \ge \epsilon$ for $s_l + 1 \le k \le r_l$. From (3.21) we get

$$\sum_{k=s_l}^{r_l-1}\alpha_k\|p_k\|^2 \ge \epsilon\sum_{k=s_l+1}^{r_l-1}\alpha_k\|p_k\| + \alpha_{s_l}\|p_{s_l}\|^2 \to \infty,$$

but this contradicts (3.20), implying that the points generated by the algorithm must lie
on a compact set. ∎

To  complete  this  section,  we  show  that  the  search  direction  computed  from  the  QP 
subproblem  is  bounded. 

Lemma 3.5. Under the assumptions of Lemma 3.4, the sequence $\{p_k\}$ is bounded.

Proof. We drop the subscript $k$ in the proof.

As all the steps taken in the solution of the QP subproblem are descent steps, we have
from (2.3),

$$\psi(p_0) \ge \psi(p) = g^T p + \tfrac12 p^T H p = \tfrac12\|H^{1/2}p + H^{-1/2}g\|^2 - \tfrac12 g^T H^{-1} g,$$

implying from HC2 and $\|a\| \le \|a + b\| + \|b\|$,

$$\sqrt{\beta_{svH}}\,\|p\| \le \|H^{1/2}p\| \le \|H^{-1/2}g\| + \|H^{1/2}p + H^{-1/2}g\| \le \|H^{-1/2}g\| + \sqrt{2\psi(p_0) + g^T H^{-1} g}.$$

The boundedness of $\|p\|$ follows from this result, Lemma 3.4, conditions HC1 and HC2 and
the bound (3.8). ∎
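
The completed-square identity behind this bound, and the resulting bound on the norm of p, can be verified on a toy example. The Python sketch below assumes a diagonal matrix H = diag(h) so that the square roots of H are available without a linear-algebra library; this restriction, the function names and the data are assumptions made purely for illustration.

```python
import math


def psi(g, h, p):
    """psi(p) = g'p + 0.5 * p'Hp for a diagonal H = diag(h) with h > 0."""
    return sum(gi * pi + 0.5 * hi * pi * pi for gi, hi, pi in zip(g, h, p))


def completed_square(g, h, p):
    """0.5 * ||H^{1/2} p + H^{-1/2} g||^2 - 0.5 * g'H^{-1}g for H = diag(h)."""
    square = sum((math.sqrt(hi) * pi + gi / math.sqrt(hi)) ** 2
                 for gi, hi, pi in zip(g, h, p))
    return 0.5 * square - 0.5 * sum(gi * gi / hi for gi, hi in zip(g, h))


def norm_bound(g, h, p0):
    """Upper bound on ||p|| valid for every p with psi(p) <= psi(p0)."""
    g_hinv_g = sum(gi * gi / hi for gi, hi in zip(g, h))
    radius = math.sqrt(max(0.0, 2.0 * psi(g, h, p0) + g_hinv_g))
    return (math.sqrt(g_hinv_g) + radius) / math.sqrt(min(h))


# Illustrative data (hypothetical).
g = [1.0, -1.0]
h = [2.0, 0.5]          # diagonal of H; its smallest entry plays the role of beta_svH
p0 = [1.0, 1.0]         # initial QP point
p = [0.2, 0.5]          # a later point with a smaller objective value

identity_gap = abs(psi(g, h, p) - completed_square(g, h, p))
p_norm = math.sqrt(sum(pi * pi for pi in p))
bound = norm_bound(g, h, p0)
print(identity_gap, psi(g, h, p), psi(g, h, p0), p_norm, bound)
```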

It is tacitly assumed in the remaining proofs that the assumptions A1-A7 and
conditions MC1, MC2, HC1 and HC2 hold.

The sequence of search directions $\{p_k\}$

In this section we relate the behavior of the sequence $\{x_k - x^*\}$, where $x^*$ denotes a KKT
point closest to $x_k$, to that of the sequence $\{p_k\}$. In particular, we show that $\|p_k\| \to 0$
implies $x_k \to x^*$, and so it is enough to prove that $\|p_k\| \to 0$ to establish global convergence.

Although the KKT point $x^*$ introduced above may not be unique, the assumptions made
on the problem, and more specifically assumption A7, imply that if $\|x_k - x^*\|$ is sufficiently
small then $x^*$ is unique, as the following lemma shows. This result allows us to work with
a well-defined sequence $\{x_k - x^*\}$, at least close to a KKT point; it will also imply that the
limit point of the sequence generated by the algorithm is unique.

Lemma 3.6. The KKT points for problem NP are isolated.

Proof. Assume that the result does not hold, and let $x^*$ denote a KKT point for NP
that is not isolated, that is, for any $\epsilon > 0$ there exists a KKT point $y_\epsilon \ne x^*$ satisfying
$\|x^* - y_\epsilon\| < \epsilon$. Consequently, there exists a sequence $\{y_k\}$ such that $y_k$ is a KKT point for
all $k$, $y_k \ne x^*$ and $y_k \to x^*$.

For sufficiently small $\|x^* - y_k\|$ the active sets at $y_k$ and $x^*$ must be the same; otherwise
we would have for some constraint $j$ that $c_j(x^*) = 0$ with both $c_j(y_k) > 0$ and $(\lambda_k)_j = 0$
along some subsequence, where $\lambda_k$ is the multiplier vector at $y_k$. From assumptions A3
and A4 and (1.1) we have $\lambda_k \to \lambda^*$, the multiplier vector at $x^*$, but this would imply
$c_j(x^*) = \lambda_j^* = 0$, contradicting assumption A6.

Let $Z_k$ denote a basis for the null-space of $\nabla\hat c(y_k)$, the Jacobian of the active constraints
at $y_k$, and $Z^*$ denote the corresponding basis at $x^*$. Among all possible bases, $Z_k$ is selected
to have continuous first derivatives in a ball around $x^*$. It follows from A4 and the fact that the
active set is constant that such bases exist.

For any element of the sequence $y_k$ and for $x^*$ we have from (1.1)

$$Z_k^T\nabla F(y_k) = 0 \quad\text{and}\quad Z^{*T}\nabla F(x^*) = 0.$$

The Taylor series expansion of $Z_k^T\nabla F(y_k)$ around $x^*$ gives

$$\begin{aligned}
0 = Z_k^T\nabla F(y_k) &= Z_k^T\bigl(\nabla F(y_k) - \nabla c(y_k)^T\lambda^*\bigr) \\
&= Z^{*T}\bigl(\nabla F(x^*) - \nabla c(x^*)^T\lambda^*\bigr) + \Bigl(\nabla Z(x^*)\bigl(\nabla F(x^*) - \nabla c(x^*)^T\lambda^*\bigr) \\
&\qquad + Z^{*T}\nabla^2 L(x^*,\lambda^*)\Bigr)(y_k - x^*) + o(\|y_k - x^*\|),
\end{aligned} \tag{3.22}$$

where $L(x,\lambda)$ is the Lagrangian function of NP. Using (1.1) in (3.22), and dividing by
$\|y_k - x^*\|$ gives

$$Z^{*T}\nabla^2 L(x^*,\lambda^*)\delta_k = o(1), \quad\text{where}\quad \delta_k = \frac{y_k - x^*}{\|y_k - x^*\|}. \tag{3.23}$$

Let $\hat c$ denote the subset of constraints active at $x^*$ and $y_k$. If $\epsilon$ is sufficiently small then
$\delta_k$ satisfies

$$\hat c(y_k) = 0 = \nabla\hat c(x^*)(y_k - x^*) + o(\|y_k - x^*\|) \ \Rightarrow\ \nabla\hat c(x^*)\delta_k = o(1). \tag{3.24}$$

Finally, for any convergent subsequence of the bounded sequence $\{\delta_k\}$, with limit $\delta$ (so that $\|\delta\| = 1$), we
have from (3.23) and (3.24),

$$Z^{*T}\nabla^2 L(x^*,\lambda^*)\delta = 0, \qquad \nabla\hat c(x^*)\delta = 0,$$

contradicting assumption A7. ∎

This result, together with A2, implies that the number of KKT points lying on any
compact region is finite. The distinctness and finiteness of the KKT points implies the
existence of $\epsilon^* > 0$ such that for any two KKT points, say $x_1^*$ and $x_2^*$, we have $\|x_1^* - x_2^*\| \ge 2\epsilon^*$.
It follows that if $\|x_k - x^*\| < \epsilon^*$, where $x^*$ is a KKT point nearest to $x_k$, then $x^*$ is unique.

We now analyze the sequence of search directions $\{p_k\}$. The following result shows that
as $p_k \to 0$ we get close to KKT points of NP and we only need to consider values $p_k$ obtained
as the minimizers for the corresponding subproblems. We complete this result by showing
that a small value of $\|p_k\|$ also implies that the correct active set at $x^*$ is identified, in the
sense that the active QP constraints at $p_k$ correspond to the active NP constraints at $x^*$.

Lemma 3.7. If along a subsequence $p_k \to 0$ then along this subsequence $\|x_k - x^*\| \to 0$,
where $x^*$ is a KKT point nearest to $x_k$. For $k$ large enough, $x^*$ is unique, $p_k$ is the QP
minimizer and the correct active set at $x^*$ is identified.


Proof. A subsequence such that $p_k \to 0$ exists if and only if a subsequence exists such
that $p_k \to 0$ and the active set at $\bar p_k$ is constant. Let $\{r\}$ denote the sequence of indices for
such a subsequence.

From the definition (2.21) of $p_r$ it follows immediately that $A_r p_r + c_r \ge 0$. From $p_r \to 0$
and assumption A3 it must hold that $c_r^- \to 0$ and $\bar p_r \to 0$.

From (2.5) we have

$$A_r^T\pi_r - g_r - H_r\bar p_r = 0 \quad\text{and}\quad \pi_r^T(A_r\bar p_r + c_r) = 0. \tag{3.25}$$

Since $\bar p_r \to 0$ it follows that

$$A_r^T\pi_r - g_r \to 0, \qquad \pi_r^T c_r \to 0 \quad\text{and}\quad c_r^- \to 0. \tag{3.26}$$

We now show that for large enough $r$, $p_r$ must have been computed as the minimizer
for the QP. It follows from $p_r \to 0$ and $\|d_r\| = 1$ that either there exists $K$ such that for all
$r \ge K$ we have $\gamma_r = 0$ or $\gamma_r \to 0$ (see (2.24)). If we assume the latter it follows that

$$\min(\hat\gamma_r,\bar\gamma_r) \to 0.$$

• If $\hat\gamma_r \to 0$ along a subsequence, then from (2.25) along this subsequence we will have
for some constraint $j$

$$\nabla c_j(x_r)^T(\bar p_r + \hat\gamma_r d_r) + c_j(x_r) = 0 \quad\text{and}\quad (\pi_r)_j = 0,$$

which implies that

$$c_j(x_r) \to 0 \quad\text{and}\quad (\pi_r)_j = 0,$$

contradicting assumption A6.

• If $\bar\gamma_r \to 0$ along a subsequence, then from (2.22),

$$\frac{\psi'(0)}{d_r^T H_r d_r} \to 0,$$

which implies from condition HC1 and $\|d_r\| = 1$ that $\psi'(0) = (H_r\bar p_r + g_r)^T d_r \to 0$.
This result and condition (2.22) on $d$ imply that for some constraint $j$ we have $(\pi_r)_j < 0$,
$(\pi_r)_j \to 0$ and $\nabla c_j(x_r)^T\bar p_r + c_j(x_r) = 0$, giving

$$c_j(x_r) \to 0 \quad\text{and}\quad (\pi_r)_j \to 0,$$

and again contradicting assumption A6.

We conclude therefore that $\gamma_r = 0$ for $r \ge K$ and this together with (3.25) implies $p_r$ is the
minimizer of the QP subproblem. For $r$ large enough $\pi_r \ge 0$, which together with (3.26)
and assumption A3 implies $\|x_r - x^*\| \to 0$, where $x^*$ is the nearest KKT point to $x_r$. For
$r$ large enough $x^*$ is unique.

Finally, we prove that for $r$ large enough the active set of the QP coincides with the
active set of NP at $x^*$. First note that for $r$ large enough the active set of the QP must
be a subset of the constraints active at $x^*$, otherwise $p_r$ is a step to a nonactive constraint
implying $\|p_r\| > \epsilon > 0$. Assume that for the subsequence we have $\nabla c_j(x_r)^T p_r + c_j(x_r) > 0$
and $c_j(x^*) = 0$. From (2.5b) we must have $(\pi_r)_j = 0$, implying from the convergence of $\pi_r$
that $\lambda_j^* = 0$, but this violates assumption A6, and for $r$ large enough the correct active set
is known. ∎

This result shows that there is an $\epsilon > 0$ such that if $\|p_k\| \le \epsilon$, then $p_k$ is the solution of
the QP subproblem, and the correct active set is known.

We have just shown that if $p_k \to 0$ along a subsequence, then $x_k \to x^*$. To show $p_k \to 0$,
we need a stronger result, giving a relationship between the rates of convergence of the
sequences $\{x_k - x^*\}$ and $\{p_k\}$.

Lemma 3.8. If $x^*$ denotes a KKT point closest to $x_k$, then there exists a constant $M$ such
that

$$\|x_k - x^*\| \le M\|p_k\|.$$


Proof. If $\|p_k\| > \epsilon$ for all $k$ then the result holds trivially since $\|x_k\|$ and $\|x^*\|$ are both
bounded. Again let $\{r\}$ denote the indices of a subsequence such that $p_r \to 0$ and the
active set at $p_r$ is constant. From Lemma 3.7, for this subsequence we have $\|x_r - x^*\| \to 0$.
We assume for the rest of this proof that $r$ is large enough so that $x^*$ is unique, $p_r$ is the
minimizer of the QP and the correct active set has been identified.

Let $\hat c$, $\hat A$ and $\hat\pi$ denote the corresponding quantities restricted to the constraints in the
active set. From assumption A4 we know that $\hat A^*$ has full row rank, and we assume that $r$
is large enough so that $\hat A_r$ also has full rank.

Let $Z_r$ denote a basis for the null space of $\hat A_r$, with uniformly bounded norm and
continuous first derivatives. From the optimality conditions for $p_r$, (2.5), we get

$$Z_r^T g_r = -Z_r^T H_r p_r, \qquad \hat c_r = -\hat A_r p_r. \tag{3.27}$$

Define

$$h(x) = \begin{pmatrix} Z(x)^T\bigl(g(x) - \hat A(x)^T\hat\lambda^*\bigr) \\ \hat c(x) \end{pmatrix}.$$

Since $h(x^*) = 0$, we have from the Taylor series expansion that

$$h_j(x_r) = S_j((\theta_r)_j)(x_r - x^*),$$

where $S_j((\theta_r)_j) = \nabla h_j(x^* + (\theta_r)_j(x_r - x^*))^T$ and $0 \le (\theta_r)_j \le 1$. We have therefore

$$\begin{pmatrix} Z_r^T g_r \\ \hat c_r \end{pmatrix} = S(\theta_r)(x_r - x^*). \tag{3.28}$$

From (3.22) we get

$$S(0) = \begin{pmatrix} Z^{*T}\nabla^2 L(x^*,\lambda^*) \\ \hat A(x^*) \end{pmatrix},$$

and assumptions A4 and A7 imply that $S(0)$ is nonsingular. It follows that for sufficiently
large values of $r$, $S(\theta_r)$ is also nonsingular. It then follows from (3.28) that for some positive
constant $M_1$,

$$\|x_r - x^*\| \le M_1\bigl(\|Z_r^T g_r\| + \|\hat c_r\|\bigr). \tag{3.29}$$

From assumption A3, property HC1 and (3.27) it follows that

$$M_2\|p_r\| \ge \|Z_r^T g_r\| + \|\hat c_r\|, \tag{3.30}$$

for some positive constant $M_2$.

Since the subsequence $\{p_k\}$ such that $p_k \to 0$ is composed of a finite number of
subsequences for which $p_r \to 0$ and the active set at $p_r$ is constant, the required result follows
from (3.29) and (3.30). ∎

Bounds  on  the  penalty  parameter 

The conditions we have imposed on the algorithm (and more specifically on the multiplier
estimate) are not sufficient to ensure that the penalty parameter is bounded. However,
bounds on $\rho_k$ are related to the behavior of different quantities in the algorithm, and in
particular to $\|p_k\|$ and $\|c_k - s_k\|$. The following lemmas introduce bounds on the size of $\rho_k$
in terms of these quantities. We start by presenting the results for those iterations where
the penalty parameter is modified, and then we extend the results to general iterations.

The notation $k_l$ is used in all that follows to indicate iterations at which the value of
the penalty parameter needs to be modified.

Lemma 3.9. For any iteration $k_l$ in which the value of $\rho$ is modified,

$$\rho_{k_l}\|c_{k_l} - s_{k_l}\| \le N \quad\text{and}\quad \rho_{k_l}\|p_{k_l}\|^2 \le N,$$

for some constant $N$.

Proof. All quantities in the proof refer to iteration $k_l$, and so this subscript is dropped.
From the definition of $\hat\rho$, (2.13), and Lemma 3.3 we get

$$\begin{aligned}
\hat\rho\|c - s\|^2 &= g^T p + \tfrac12 p^T H p + (2\lambda - \mu)^T(c - s) \\
&\le -\beta_1 p^T H p + \beta_2\|c - s\| + (2\lambda - \mu)^T(c - s) \le (\beta_2 + \|2\lambda - \mu\|)\|c - s\|,
\end{aligned}$$

where $\beta_1$ and $\beta_2$ are positive constants. From (3.14) and the above result we obtain the
first bound in the Lemma,

$$\hat\rho\|c - s\| \le 3\beta_\mu + \beta_2. \tag{3.31}$$

If the penalty parameter needs to be modified, condition (2.11) cannot hold for $\rho = \rho_{k_l-1}$,
and (3.2) implies

$$\phi'(0) = g^T p + (2\lambda - \mu)^T(c - s) - \rho_{k_l-1}\|c - s\|^2 > -\tfrac12 p^T H p.$$

It follows that

$$g^T p + \tfrac12 p^T H p + (2\lambda - \mu)^T(c - s) > 0. \tag{3.32}$$

Replacing in (3.32) the bound for $g^T p + \tfrac12 p^T H p$ given in Lemma 3.3 we obtain a bound
on $p^T H p$ in terms of $\|c - s\|$, (3.33).

From condition HC2 we have $\|p\|^2 \le (1/\beta_{svH})\,p^T H p$. If we multiply both sides of this
inequality by $\hat\rho$ and use (3.33) to bound $p^T H p$, we obtain

$$\hat\rho\|p\|^2 \le \hat\rho\,\frac{1}{\beta_{svH}}\,p^T H p \le \frac{3\beta_\mu}{\beta_1\beta_{svH}}\,\hat\rho\|c - s\| \le \frac{3\beta_\mu(3\beta_\mu + \beta_2)}{\beta_1\beta_{svH}},$$

where the last inequality follows from (3.31). The second desired bound then follows from
$2\hat\rho \ge \rho$. ∎

We now extend these results to all iterations. To simplify notation, we shall use $I$ and
$K$ to denote $k_l$ and $k_{l+1}$ respectively. Thus, the penalty parameter is increased at $x_I$ and
$x_K$ in order to satisfy condition (2.11), and remains fixed at $\rho_I$ for iterations $I,\ldots,K-1$.

Lemma 3.10. There exists a constant $M$ such that for all $l$,

$$\rho_{k_l}\sum_{k=k_l}^{k_{l+1}-1}\|\alpha_k p_k\|^2 \le M. \tag{3.34}$$

Proof. For $I \le k \le K-1$, property (2.15a) imposed by the choice of $\alpha_k$, and the fact
that the penalty parameter is not increased, imply that

$$\phi_k - \phi_{k+1} \ge -\sigma\alpha_k\phi_k'(0).$$

Summing these inequalities for $k = I$ to $K-1$, $0 \le \alpha_k \le 1$ together with (3.3) gives

$$\tfrac12\sigma\beta_{svH}\sum_{k=I}^{K-1}\|\alpha_k p_k\|^2 \le \phi_I - \phi_K. \tag{3.35}$$

Consider the term $\rho_I(\phi_I - \phi_K)$. From (2.2),

$$\rho\phi = \rho F - \rho\lambda^T(c - s) + \tfrac12\rho^2\|c - s\|^2.$$

This equation, together with the boundedness of $\rho_I\|c_I - s_I\|$ and $\rho_I\|c_K - s_K\|$ (implied by
$\rho_K \ge \rho_I$ and Lemma 3.9), and that of the multiplier estimates (Lemma 2.1), implies that
for some $M_1 > 0$,

$$\rho_I(\phi_I - \phi_K) \le M_1 + \rho_I(F_I - F_K). \tag{3.36}$$

Consider now iterations for which $\|p_I\| \le \epsilon$, so that Lemma 3.7 applies and $p_I$ has been
obtained as the minimizer for the subproblem (for all other iterations Lemma 3.9 implies
that $\rho_I$ is bounded, and the result follows from Assumption A3, (3.36) and (3.35)).
Expanding $F_K$ and $c_K$ about $x_I$, we get

$$F_K - F_I = (x_K - x_I)^T g_I + O(\|x_I - x_K\|^2) \tag{3.37a}$$

$$c_K - c_I = A_I(x_K - x_I) + O(\|x_I - x_K\|^2). \tag{3.37b}$$

From Lemma 3.8 we have

$$\|x_I - x^*\| \le M_p\|p_I\| \quad\text{and}\quad \|x_K - x^*\| \le M_p\|p_K\|. \tag{3.38}$$

As $p_I$ was obtained as the solution of the QP, condition (2.5a) must hold with multiplier
vector $\pi_I \ge 0$. This condition together with (3.37a), (3.37b) and (3.38) implies

$$F_I - F_K = (c_I - c_K)^T\pi_I + O\bigl(\max(\|p_I\|^2,\|p_K\|^2)\bigr). \tag{3.39}$$

Using again (2.5),

$$c_I^T\pi_I = -p_I^T A_I^T\pi_I = -g_I^T p_I - p_I^T H_I p_I.$$

Since $\rho$ is increased at iteration $I$, we must have that condition (2.11) cannot hold at that
iteration, implying

$$\phi_I'(0) = g_I^T p_I + (2\lambda_I - \mu_I)^T(c_I - s_I) - \rho_{I-1}\|c_I - s_I\|^2 > -\tfrac12 p_I^T H_I p_I.$$

The previous two results imply

$$\rho_I\pi_I^T c_I < -\tfrac12\rho_I p_I^T H_I p_I + \rho_I(2\lambda_I - \mu_I)^T(c_I - s_I) - \rho_I\rho_{I-1}\|c_I - s_I\|^2,$$

and this, together with the positive-definiteness of $H_I$ (condition HC2), the boundedness
of the multipliers (condition MC1 and Lemma 2.1) and Lemma 3.9, gives

$$\rho_I c_I^T\pi_I \le \rho_I(2\lambda_I - \mu_I)^T(c_I - s_I) \le M_2, \tag{3.40}$$

for some $M_2 > 0$.

Consider now the term $c_K^T\pi_I$ in (3.39). From $\pi_I \ge 0$ we must have

$$-\rho_I c_K^T\pi_I \le \rho_I(c_K^-)^T\pi_I,$$

and from (2.9) we have $\|c_K^-\| \le \|c_K - s_K\|$. Using $\rho_I \le \rho_K$ and Lemma 3.9, we conclude
that there exists a constant $M_3$ such that

$$-\rho_I c_K^T\pi_I \le M_3. \tag{3.41}$$

Finally, consider the third term on the right-hand side of (3.39). It follows from
Lemma 3.9 and the relation $\rho_I \le \rho_K$ that there exist $M_4$ and $M_5$ such that

$$\rho_I\|p_I\|^2 \le M_4 \quad\text{and}\quad \rho_I\|p_K\|^2 \le M_5,$$

and hence for some constant $M_6$

$$\rho_I\,O\bigl(\max(\|p_I\|^2,\|p_K\|^2)\bigr) \le M_6. \tag{3.42}$$

Combining (3.40), (3.41) and (3.42), we obtain the bound

$$\rho_I(F_I - F_K) \le M_2 + M_3 + M_6,$$

which, together with (3.36) and (3.35), implies the desired result. ∎

Lemma 3.11. There exists a constant $M$ such that, for all $k$,

$$\rho_k\|c_k - s_k\| \le M. \tag{3.43}$$

Proof. As in the preceding Lemma, let $I = k_l$ and $K = k_{l+1}$. From Lemma 3.9, (3.43) is
immediate for $k = I$ and $k = K$.

To verify a bound for $k = I+1,\ldots,K-1$ we analyze some intermediate iterations $k$
and $k+1$. We drop the iteration subscript; also let quantities evaluated at $x_{k+1}$ be denoted
with a tilde.

From (2.8), $\rho_I(c_j - s_j) = \min(\rho_I c_j, \lambda_j)$. Consider the following two cases:

• If $\rho_I c_j \ge -|\lambda_j|$, then

$$\rho_I|c_j - s_j| \le |\lambda_j|. \tag{3.44}$$

• Assume now that $\rho_I\tilde c_j < -|\tilde\lambda_j|$. Expanding the $j$-th constraint function around $x_k$
gives

$$\tilde c_j = c_j + \alpha a_j^T p + O(\|\alpha p\|^2).$$

Rewriting the previous expression, we obtain:

$$\tilde c_j = (1 - \alpha)c_j + \alpha(a_j^T p + c_j) + O(\|\alpha p\|^2). \tag{3.45}$$

Adding and subtracting $(1 - \alpha)s_j$ on the right-hand side of (3.45) gives

$$\tilde c_j = (1 - \alpha)(c_j - s_j) + (1 - \alpha)s_j + \alpha(a_j^T p + c_j) + O(\|\alpha p\|^2). \tag{3.46}$$

Since $s_j$, $a_j^T p + c_j$, $\alpha$ and $1 - \alpha$ are all non-negative, we get

$$(1 - \alpha)s_j + \alpha(a_j^T p + c_j) \ge 0,$$

and using this bound in (3.46) we obtain

$$\tilde c_j \ge (1 - \alpha)(c_j - s_j) + O(\|\alpha p\|^2). \tag{3.47}$$

Since we assume $\rho_I\tilde c_j < -|\tilde\lambda_j|$ we have $\tilde c_j = \tilde c_j - \tilde s_j < 0$. Using this bound and
$1 - \alpha \le 1$ in (3.47) we get the following inequality:

$$-\tilde c_j = |\tilde c_j| = |\tilde c_j - \tilde s_j| \le -(1 - \alpha)(c_j - s_j) + O(\|\alpha p\|^2) \le |c_j - s_j| + O(\|\alpha p\|^2).$$

Multiplying both sides by $\rho_I$ gives

$$\rho_I|\tilde c_j - \tilde s_j| \le \rho_I|c_j - s_j| + \rho_I\,O(\|\alpha p\|^2). \tag{3.48}$$

For a given iteration $k \le K-1$ and constraint $j$ we have one of the following two situations:

• For some iteration $\ell$, $I \le \ell \le k$, $\rho_I(c_\ell)_j \ge -|(\lambda_\ell)_j|$. If we add (3.48) for iterations
$r = \ell,\ldots,k-1$, and use (3.44), we get

$$\rho_I|(c_k)_j - (s_k)_j| \le \rho_I|(c_\ell)_j - (s_\ell)_j| + \rho_I\,O\Bigl(\sum_{r=\ell}^{k-1}\|\alpha_r p_r\|^2\Bigr) \le |(\lambda_\ell)_j| + \rho_I\,O\Bigl(\sum_{r=\ell}^{k-1}\|\alpha_r p_r\|^2\Bigr).$$

The boundedness of $\rho_I|(c_k)_j - (s_k)_j|$ then follows from Lemmas 2.1 and 3.10.

• For all iterations $\ell$, $I \le \ell \le k$, we have $\rho_I(c_\ell)_j < -|(\lambda_\ell)_j|$. We add (3.48) for $r = I$ to
$k-1$, to obtain

$$\rho_I|(c_k)_j - (s_k)_j| \le \rho_I|(c_I)_j - (s_I)_j| + \rho_I\,O\Bigl(\sum_{r=I}^{k-1}\|\alpha_r p_r\|^2\Bigr),$$

and now the desired result follows from Lemmas 3.9 and 3.10. ∎
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Boundedness  of  a*. 

Given the result of Lemma 3.10, all that is left to establish the global convergence of the algorithm is to show that the steplength is bounded away from zero. As a consequence of the weak assumptions imposed on the multiplier estimate $\mu_k$, it is not possible to show that such a bound exists. However, it can be proved that the bound does exist if there is no subsequence along which $\|p_k\| \to 0$. This is enough to prove convergence.

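We first derive a bound on the norm of the second derivative along the linesearch.

Lemma 3.12. For $0 \le \theta \le \alpha_k$, there exists a positive constant $N$ such that

\[ \phi_k''(\theta) \le N. \]

Proof. We again drop the subscript $k$. From (3.1),

\[ \nabla^2 L_A = \begin{pmatrix} \nabla^2 F - \sum_j (\lambda_j - \rho(c_j - s_j)) \nabla^2 c_j + \rho A^T A & -A^T & -\rho A^T \\ -A & 0 & I \\ -\rho A & I & \rho I \end{pmatrix}. \]

From the definition of $\phi$, given in (2.2), we get

\[ \phi''(\theta) = p^T W p + \sum_j \rho (c_j(\theta) - s_j(\theta))\, p^T \nabla^2 c_j(\theta) p + \rho (A(\theta)p - q)^T (A(\theta)p - q) - 2\xi^T (A(\theta)p - q), \tag{3.49} \]

where the argument $\theta$ denotes quantities evaluated at $x + \theta p$, except for $s(\theta) = s + \theta q$ and

\[ W = \nabla^2 F(\theta) - \sum_j (\lambda_j + \theta\xi_j) \nabla^2 c_j(\theta). \]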
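Formula (3.49) can be compared against a direct numerical second derivative of $\phi$. The sketch below is an illustration under stated assumptions: it takes the merit function to be the augmented Lagrangian $L_A = F - \lambda^T(c - s) + \tfrac{1}{2}\rho\|c - s\|^2$ (the form consistent with the Hessian displayed above), and it uses a small made-up problem with quadratic $F$ and $c_j$. For such data $\phi$ is a quartic polynomial in $\theta$, so a five-point stencil reproduces $\phi''$ up to rounding. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2

def sym(M):
    return 0.5 * (M + M.T)

# Hypothetical test problem: quadratic objective and quadratic constraints.
Q = sym(rng.standard_normal((n, n)))
b = rng.standard_normal(n)
C = [sym(rng.standard_normal((n, n))) for _ in range(m)]
d = rng.standard_normal((m, n))
e = rng.standard_normal(m)

def F(x):
    return 0.5 * x @ Q @ x + b @ x

def c(x):
    return np.array([0.5 * x @ C[j] @ x + d[j] @ x + e[j] for j in range(m)])

def A(x):
    # Jacobian of c: row j is the gradient of c_j.
    return np.array([C[j] @ x + d[j] for j in range(m)])

def merit(x, lam, s, rho):
    # Assumed augmented Lagrangian form of the merit function.
    r = c(x) - s
    return F(x) - lam @ r + 0.5 * rho * (r @ r)

x = rng.standard_normal(n)
lam = rng.standard_normal(m)
s = np.abs(rng.standard_normal(m))
rho = 2.0
p = rng.standard_normal(n)
xi = rng.standard_normal(m)
q = A(x) @ p + c(x) - s            # slack direction, as in (2.10)

def phi(theta):
    return merit(x + theta * p, lam + theta * xi, s + theta * q, rho)

def phi2_formula(theta):
    # Right-hand side of (3.49).
    xt = x + theta * p
    r = c(xt) - (s + theta * q)
    W = Q - sum((lam[j] + theta * xi[j]) * C[j] for j in range(m))
    curv = np.array([p @ C[j] @ p for j in range(m)])
    v = A(xt) @ p - q
    return p @ W @ p + rho * (r @ curv) + rho * (v @ v) - 2.0 * (xi @ v)

def phi2_stencil(theta, h=0.1):
    # Five-point second-derivative stencil, exact for polynomials of degree <= 5.
    return (-phi(theta + 2 * h) + 16 * phi(theta + h) - 30 * phi(theta)
            + 16 * phi(theta - h) - phi(theta - 2 * h)) / (12 * h * h)
```

Evaluating `phi2_formula` and `phi2_stencil` at a few values of `theta` in $[0, 1]$ gives matching numbers, which confirms the signs of the four terms in (3.49) for this example.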

We now derive bounds on the terms on the right-hand side of (3.49). For the first term we can write

\[ p^T W p \le N_1 \|p\|^2 \le M_1, \tag{3.50} \]

for some constants $N_1$ and $M_1$, using assumption A3, the boundedness of $\|\lambda\|$ and $\|\xi\|$ (condition MC1 and Lemma 2.1), and the boundedness of $\|p\|$ (Lemma 3.5).

Expanding $c_j$ in a Taylor series about $x$ gives

\[ c_j(\theta) = c_j(x) + \theta a_j(x)^T p + \tfrac{1}{2}\theta^2 p^T \nabla^2 c_j(\theta_j) p, \]

where $0 \le \theta_j \le \theta$. Using (2.10) and multiplying both sides by $\rho$ gives

\[ \rho (c_j(\theta) - s_j(\theta)) = \rho (1-\theta)(c_j(x) - s_j) + \tfrac{1}{2}\rho\theta^2 p^T \nabla^2 c_j(\theta_j) p. \]

Lemma 3.11 implies that $\rho |c_j(x) - s_j|$ is bounded, Lemma 3.10 implies that $\rho \|\theta p\|^2$ is bounded for $\theta \le \alpha$, and assumption A3 implies that $\|\nabla^2 c_j(\theta_j)\|$ is also bounded. Consequently,

\[ \rho |c_j(\theta) - s_j(\theta)| \le \bar N, \]

where $\bar N$ is a constant. This result and Lemma 3.5 imply the second term in (3.49) is also bounded, that is,

\[ \sum_j |\rho (c_j(\theta) - s_j(\theta))\, p^T \nabla^2 c_j(\theta) p| \le N_2 \|p\|^2 \le M_2, \tag{3.51} \]

where $N_2$ and $M_2$ are constants.

Consider now $\rho \|A(\theta)p - q\|^2$, the third term on the right-hand side of (3.49). Using Taylor series, we have

\[ a_j(x + \theta p)^T p = a_j^T p + \theta p^T \nabla^2 c_j(\theta_j) p, \tag{3.52} \]

where $0 \le \theta_j \le \theta$. From (2.10) and Lemmas 3.10 and 3.11, we obtain

\[ \rho \|A(\theta)p - q\|^2 \le M_3, \tag{3.53} \]

where $M_3$ is a constant.

From (3.52), (2.10), assumption A3 and the boundedness of $\|\xi\|$ (Lemma 2.1), the final term on the right-hand side of (3.49) is also bounded,

\[ -2\xi^T (A(\theta)p - q) = -2\xi^T (Ap - q) - 2\theta \sum_j \xi_j p^T \nabla^2 c_j(\theta_j) p \le 2\xi^T (c - s) + N_4 \|p\|^2 \le M_4, \tag{3.54} \]

where $N_4$ and $M_4$ are constants.

The desired bound follows from (3.49), (3.50), (3.51), (3.53) and (3.54). ∎

Lemma 3.13. For any $\epsilon > 0$, if $\|p_k\| \ge \epsilon$ there exists a value $\bar\alpha(\epsilon)$ such that $\alpha_k \ge \bar\alpha(\epsilon) > 0$, where $\alpha_k$ is the steplength computed by the algorithm.


Proof. We drop the subscript $k$ corresponding to the iteration number. We start by proving that $\hat\alpha$ (as defined in (2.14) and (2.15)) is bounded away from zero if $\|p\| \ge \epsilon$. If condition (2.14) is satisfied at a given iteration, then $\hat\alpha = 1$, trivially bounded away from zero. We assume therefore that $\hat\alpha$ is chosen to satisfy (2.15).

In the proof of Lemma 3.1 it was shown that the linesearch procedure was well defined, and in particular, that there exists a value $\hat\alpha \in (0, 1]$ satisfying (2.15) and such that condition (2.15b) is not satisfied for any value of $\alpha \in [0, \hat\alpha)$; see (3.5), (3.7) and (3.6).

From the Taylor series expansion of $\phi'$ at $\hat\alpha$ we have

\[ \phi'(\hat\alpha) = \phi'(0) + \hat\alpha \phi''(\theta), \]

where $0 \le \theta \le \hat\alpha$. Therefore, using (3.5) and noting that $\eta < 1$ and $\phi'(0) < 0$, we obtain

\[ \hat\alpha = \frac{\phi'(\hat\alpha) - \phi'(0)}{\phi''(\theta)} \ge -(1-\eta) \frac{\phi'(0)}{\phi''(\theta)}. \tag{3.55} \]

(Since $\hat\alpha > 0$, $\theta$ must be such that $\phi''(\theta) > 0$.)

If $\|p\| \ge \epsilon$, then from (3.3) we have that $|\phi'(0)| \ge \tfrac{1}{2}\beta_{svH}\epsilon^2$, and from Lemma 3.12 we also have $\phi''(\theta) \le N$, implying

\[ \hat\alpha \ge (1-\eta) \frac{\beta_{svH}}{2N} \epsilon^2. \]

If condition (2.16) is satisfied for $\hat\alpha$, then the previous bound holds for $\alpha$. Otherwise, for some constraint $j$ we must have $h_j(\hat\alpha) = c_j(x + \hat\alpha p) + \beta_c < 0$ (using the notation introduced in Lemma 3.1). If $h_j(0) \ge \tfrac{1}{2}\beta_c > 0$, from the continuity of $h_j$ there exists a value $\tilde\alpha \le \hat\alpha$ such that $h_j(\tilde\alpha) = 0$ and $h_j(\alpha) \ge 0$ for all $\alpha \in [0, \tilde\alpha]$. From the mean-value theorem

\[ \tilde\alpha = \frac{h_j(\tilde\alpha) - h_j(0)}{h_j'(\theta)} = \frac{h_j(0)}{|h_j'(\theta)|}, \]

for some $\theta \in [0, \tilde\alpha]$. But as $|h_j'(\theta)| = |a_j(x + \theta p)^T p| \le K$ for some $K > 0$ (from assumption A3 and the boundedness of $\|p\|$, Lemma 3.5), we have

\[ \tilde\alpha \ge \frac{\beta_c}{2K}. \tag{3.56} \]

If $h_j(0) < \tfrac{1}{2}\beta_c$, we must have from (2.4b),

\[ h_j'(0) = a_j^T p \ge -c_j = \beta_c - h_j(0) > \tfrac{1}{2}\beta_c. \]

From $h_j(0) \ge 0$ and $h_j(\hat\alpha) < 0$ there must exist a value $\alpha < \hat\alpha$ such that $h_j'(\alpha) < 0$, implying the existence of $\tilde\alpha < \hat\alpha$ such that $h_j'(\tilde\alpha) = 0$ and $h_j'(\alpha) \ge 0$ for all $\alpha \in [0, \tilde\alpha]$ (also, $h_j(\alpha) \ge 0$ for all $\alpha \in [0, \tilde\alpha]$). From the mean-value theorem,

\[ \tilde\alpha = \frac{h_j'(\tilde\alpha) - h_j'(0)}{h_j''(\theta)} = \frac{h_j'(0)}{|h_j''(\theta)|}, \]

for some $\theta \in [0, \tilde\alpha]$. But $h_j'(0) > \tfrac{1}{2}\beta_c$, and $|h_j''(\theta)| = |p^T \nabla^2 c_j(x + \theta p) p| \le K$ for some $K > 0$, from assumption A3 and the boundedness of $\|p\|$, Lemma 3.5, implying again

\[ \tilde\alpha \ge \frac{\beta_c}{2K}. \tag{3.57} \]


The  procedure  to  construct  a  will  ensure  that  a  >  |a,  and  so  the  result  presented  in 
the Lemma will hold. ∎

We  can  now  prove  the  global  convergence  theorem  for  the  algorithm. 


Theorem 3.1. The sequence $\{x_k\}$ generated by the algorithm converges to a unique KKT point for NP.

Proof.

It follows from Lemma 3.8 that to prove $\|x_k - x^*\| \to 0$, it is sufficient to show

\[ \lim_{k\to\infty} \|p_k\| = 0. \tag{3.58} \]

If (3.58) is true then there exists $K$ such that $\|x_k - x^*\| \le \epsilon^*/2$ and $\|p_k\| \le \epsilon^*$ for all $k \ge K$, where $2\epsilon^*$ is the minimum distance between two KKT points. It follows that $x^*$ is unique for $k \ge K$ (the sequence converges to the unique KKT point nearest to $x_K$); otherwise, for some $k \ge K$ either $\|x_k - x^*\| > \epsilon^*/2$ or $\|p_k\| > \epsilon^*$. Consequently, to prove the theorem it is sufficient to show (3.58) is true.

If $\|p_k\| = 0$ for any $k$, the algorithm terminates and the theorem is true. Hence we assume that $\|p_k\| \ne 0$ for any $k$. If $p_k \not\to 0$, there must exist a subsequence $\{p_l\}$, and a positive constant $\epsilon$, such that $\|p_l\| \ge \epsilon$ for all $l$. In this case, from Lemma 3.13 there will exist a uniform lower bound on $\alpha_l$, $\alpha_l \ge \bar\alpha > 0$, but then

\[ \rho_l \|\alpha_l p_l\| \ge \bar\alpha \epsilon \rho_l \to \infty, \]

contradicting the fact that $\rho_k \|\alpha_k p_k\|$ is bounded (Lemma 3.10).

In the bounded case, we know that there exists a value $\bar\rho$ and an iteration index $\bar K$ such that $\rho_k = \bar\rho$ for all $k \ge \bar K$. Again, the proof is by contradiction. Consider only indices $l$ such that $l \ge \bar K$. Every such iteration after $\bar K$ must yield a strict decrease in the merit function because the termination condition for the linesearch (2.15a), together with the boundedness of the steplength (from Lemma 3.13 and $\|p_l\| \ge \epsilon$) and (3.3) imply

\[ \phi_l(\alpha_l) - \phi_l(0) \le \sigma \alpha_l \phi_l'(0) \le -\tfrac{1}{2}\sigma\bar\alpha\beta_{svH} \|p_l\|^2 < 0. \]

The adjustment of the slack variables $s$ in (2.7) can only lead to a further reduction in the merit function, as $L_A$ is quadratic in $s$ and the minimizer with respect to $s_j$ is given by $c_j - \lambda_j/\rho$. From the fact that the penalty parameter is not modified, for iterations from the subsequence we have

\[ \phi(x_{l+1}) - \phi(x_l) \le -\tfrac{1}{2}\sigma\bar\alpha\beta_{svH}\epsilon^2. \]

Therefore, since the merit function with $\rho = \bar\rho$ decreases by at least a fixed quantity at every step in the subsequence, it must be unbounded below, contradicting (3.18). It follows that (3.58) must hold. ∎

Having established the global convergence of the algorithm, the next step is to show that the multiplier estimate $\lambda_k \to \lambda^*$. In order to prove this result, we need to strengthen our conditions on the multiplier estimate $\mu_k$ (if $\mu_k$ does not converge then $\lambda_k$ will not converge either). The additional condition is

MC3. $\|\mu_k - \lambda^*\| = O(\|x_k - x^*\|)$, where $\lambda^*$ denotes any multiplier vector associated with a KKT point closest to $x_k$.

This condition requires that $\beta_M$ in condition MC1 be chosen so that

\[ \beta_M \ge \|\lambda^*\|. \tag{3.59} \]

Estimates satisfying MC1, MC2 and MC3 may be obtained by computing a multiplier for the “active” constraints (say, least-squares estimates of least length), and expanding to the full multiplier space by augmenting this vector with zeros corresponding to the inactive constraints. If such an estimate does not satisfy MC1, then a suitable estimate may be determined by appropriate scaling. The multipliers at the stationary point of the QP also satisfy the three conditions. Note that if $x^*$ is not unique then from Lemma 3.6, $\|x_k - x^*\| \ge \epsilon$ for some $\epsilon > 0$, and MC3 holds for any vector $\mu_k$ that is bounded.

We first show that under the stronger conditions on $\mu_k$ the steplength $\alpha_k$ is uniformly bounded away from zero.

Lemma 3.14. Under MC3 and all earlier assumptions and conditions, $\alpha_k \ge \bar\alpha > 0$.

Proof. We again drop the subscript $k$. We first tighten the bound on $\phi''(\theta)$ given in Lemma 3.12. From (3.50) and (3.51), we have that the first two terms on the right-hand side of (3.49) are bounded by a multiple of $\|p\|^2$. From (3.52), (2.10), (3.54) and Lemmas 3.10 and 3.11 we may obtain the following bound on the remaining terms on the right-hand side of (3.49):

\[ \rho (A(\theta)p - q)^T (A(\theta)p - q) - 2\xi^T (A(\theta)p - q) \le \rho (c - s)^T (c - s) + 2\xi^T (c - s) + M \|p\|^2, \tag{3.60} \]

for some constant $M$.

Observe that from (3.2) and (2.4b),

\[ \rho (c - s)^T (c - s) + 2\xi^T (c - s) = -\phi'(0) + p^T g + \mu^T (c - s) = -\phi'(0) + p^T (g - A^T \mu) - \mu^T s. \tag{3.61} \]

Using Taylor expansions and Lemma 3.8 it follows that

\[ p^T (g - A^T \mu) = p^T (g^* - A^{*T} \mu) + O(\|p\|^2) = (\lambda^* - \mu)^T A^* p + O(\|p\|^2). \]

From this result and MC3 there exists a constant $\bar M$ such that

\[ p^T (g - A^T \mu) \le \bar M \|p\|^2. \tag{3.62} \]

From $\mu_k \to \lambda^*$, strict complementarity at a KKT point (assumption A6), and the fact that the correct active set is identified for $\|p\|$ small enough (Lemma 3.7), we eventually have $\mu \ge 0$ and $\mu^T s \ge 0$. Consequently, it follows from (3.49), (3.50), (3.51), (3.60), (3.61) and (3.62) that

\[ \phi''(\theta) \le -\phi'(0) + N \|p\|^2 \]

for some constant $N > 0$. This result and (3.3) can be used with (3.55) to imply that there exists a value $\hat\alpha$ satisfying (2.15) such that

\[ \hat\alpha \ge (1-\eta) \frac{\beta_{svH} \|p\|^2}{(\beta_{svH} + 2N) \|p\|^2} = (1-\eta) \frac{\beta_{svH}}{\beta_{svH} + 2N} > 0. \]

The desired result then follows from an argument identical to that given in the final part of Lemma 3.13. ∎
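The final bound of this proof rests on a monotonicity argument that is simple to test. The sketch below assumes, as the reasoning above suggests, that (3.55) gives a step of at least $(1-\eta)|\phi'(0)|/\phi''(\theta)$, that the tightened curvature bound reads $\phi''(\theta) \le |\phi'(0)| + N\|p\|^2$, and that (3.3) provides $|\phi'(0)| \ge \tfrac{1}{2}\beta_{svH}\|p\|^2$. The function names and sampled values are illustrative only.

```python
import random

def step_lower_bound(eta, beta_svh, N):
    """Uniform bound (1 - eta) * beta / (beta + 2N) stated at the end of the proof."""
    return (1.0 - eta) * beta_svh / (beta_svh + 2.0 * N)

def step_estimate(eta, dphi0_abs, N, p_norm_sq):
    """(1 - eta) |phi'(0)| / (|phi'(0)| + N ||p||^2), from (3.55) and the curvature bound."""
    return (1.0 - eta) * dphi0_abs / (dphi0_abs + N * p_norm_sq)

random.seed(0)
samples = []
for _ in range(1000):
    eta = random.uniform(0.0, 0.99)
    beta = random.uniform(0.1, 10.0)
    N = random.uniform(0.1, 100.0)
    p2 = random.uniform(1e-6, 10.0)
    # Any |phi'(0)| at least as large as the lower bound supplied by (3.3).
    dphi = 0.5 * beta * p2 * random.uniform(1.0, 50.0)
    samples.append((step_estimate(eta, dphi, N, p2), step_lower_bound(eta, beta, N)))

uniform_bound_holds = all(est >= low - 1e-12 for est, low in samples)
```

Because $t \mapsto t/(t + a)$ is increasing for $a > 0$, the estimate is smallest when $|\phi'(0)|$ sits exactly at its lower bound, and at that point $\|p\|^2$ cancels, which is what makes the bound independent of the iteration.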

This  lemma  also  implies  that  the  effort  needed  to  compute  the  value  for  the  steplength 
is  uniformly  bounded  in  the  algorithm.  We  now  establish  the  convergence  of  the  multiplier 
estimate. 

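Theorem 3.2. Under MC3 and all other assumptions and conditions,

\[ \lim_{k\to\infty} \lambda_k = \lambda^*. \]

Proof. From (2.27),

\[ \lambda_{k+1} = \sum_{l=0}^{k} \gamma_{lk} \mu_l, \tag{3.63} \]

where

\[ \gamma_{kk} = \tilde\alpha_k, \qquad \gamma_{lk} = \tilde\alpha_l \prod_{r=l+1}^{k} (1 - \tilde\alpha_r), \quad l < k, \tag{3.64} \]

with $\tilde\alpha_0 = 1$ and $\tilde\alpha_l = \alpha_l$, $l \ge 1$. (This convention is used because of the special initial condition that $\lambda_0 = \mu_0$.) From Lemma 3.14 and (3.64), we observe that

\[ 0 < \bar\alpha \le \tilde\alpha_l \le 1 \quad \text{for all } l, \tag{3.65a} \]

\[ \sum_{l=0}^{k} \gamma_{lk} = 1, \tag{3.65b} \]

\[ \gamma_{lk} \le (1 - \bar\alpha)^{k-l}, \quad l < k. \tag{3.65c} \]

From condition MC3 we have

\[ \mu_k = \lambda^* + M_k \delta_k t_k, \tag{3.66} \]

with $|M_k| \le M$, $\delta_k = \|x_k - x^*\|$ and $\|t_k\| = 1$. From Theorem 3.1, for any $\epsilon > 0$ we can choose a value $K_1$ so that, for $k \ge K_1$,

\[ |M_k \delta_k| \le \tfrac{1}{2}\epsilon. \tag{3.67} \]

Given any $\epsilon > 0$, we can also define an iteration index $K_2$ with the following property:

\[ (1 - \bar\alpha)^k \le \frac{\epsilon}{2(k+1)(1 + 2\beta_M)} \tag{3.68} \]

for $k \ge K_2 + 1$. Let $K = \max(K_1, K_2)$. Then, from (3.63) and (3.66), we have for $k \ge 2K$,

\[ \lambda_{k+1} = \sum_{l=0}^{K} \gamma_{lk} \mu_l + \sum_{l=K+1}^{k} \gamma_{lk} (\lambda^* + M_l \delta_l t_l). \]

Hence it follows from (3.65b) that

\[ \lambda_{k+1} - \lambda^* = \sum_{l=0}^{K} \gamma_{lk} (\mu_l - \lambda^*) + \sum_{l=K+1}^{k} \gamma_{lk} M_l \delta_l t_l. \]

From the bounds on $\|\mu_l\|$ (condition MC1), $\|t_l\|$, and (3.59), we obtain

\[ \|\lambda_{k+1} - \lambda^*\| \le 2\beta_M \sum_{l=0}^{K} \gamma_{lk} + \sum_{l=K+1}^{k} \gamma_{lk} |M_l| |\delta_l|. \tag{3.69} \]

Since we assume $k \ge 2K$, it follows from (3.65a) and (3.65c) that

\[ \sum_{l=0}^{K} \gamma_{lk} \le \sum_{l=0}^{K} (1 - \bar\alpha)^{k-l} \le \sum_{l=0}^{K} (1 - \bar\alpha)^{K} \le (K+1)(1 - \bar\alpha)^{K}. \]

Using (3.68), we thus obtain the following bound for the first term on the right-hand side of (3.69):

\[ 2\beta_M \sum_{l=0}^{K} \gamma_{lk} \le \tfrac{1}{2}\epsilon. \tag{3.70} \]

To bound the second term in (3.69), we use (3.65b) and (3.67):

\[ \sum_{l=K+1}^{k} \gamma_{lk} |M_l| |\delta_l| \le \tfrac{1}{2}\epsilon \sum_{l=K+1}^{k} \gamma_{lk} \le \tfrac{1}{2}\epsilon. \tag{3.71} \]

Combining (3.69)-(3.71), we obtain the following result: given any $\epsilon > 0$, we can find $K$ such that

\[ \|\lambda_k - \lambda^*\| \le \epsilon \quad \text{for } k \ge 2K + 1, \]

which implies the desired result. ∎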
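The convex-combination structure behind (3.63)-(3.65) can be illustrated with a short script. It assumes the multiplier update referred to as (2.27) has the form $\lambda_{k+1} = \lambda_k + \alpha_k(\mu_k - \lambda_k)$ with $\lambda_0 = \mu_0$, which matches the weights in (3.64); the helper names and the random data are hypothetical.

```python
import numpy as np

def gamma_weights(alphas):
    """Weights gamma_{lk}, l = 0..k, for steplengths alpha_0..alpha_k.

    The first steplength is replaced by 1, following the convention
    adopted because lambda_0 = mu_0.
    """
    a = np.array(alphas, dtype=float)
    a[0] = 1.0
    k = len(a) - 1
    g = np.empty(k + 1)
    for l in range(k + 1):
        # An empty product (l = k) evaluates to 1, giving gamma_{kk} = alpha_k.
        g[l] = a[l] * np.prod(1.0 - a[l + 1:])
    return g

def run_multiplier_update(mus, alphas):
    """Assumed update lambda_{k+1} = lambda_k + alpha_k (mu_k - lambda_k), lambda_0 = mu_0."""
    lam = np.array(mus[0], dtype=float)
    for mu, alpha in zip(mus, alphas):
        lam = lam + alpha * (np.asarray(mu, dtype=float) - lam)
    return lam

rng = np.random.default_rng(2)
alpha_bar = 0.2                                   # lower bound on the steplengths
alphas = rng.uniform(alpha_bar, 1.0, size=12)     # alpha_0, ..., alpha_11
mus = rng.standard_normal((12, 3))                # mu_0, ..., mu_11

g = gamma_weights(alphas)
lam_next = run_multiplier_update(mus, alphas)     # lambda_12
k = len(alphas) - 1
decay = (1.0 - alpha_bar) ** (k - np.arange(k + 1))
```

Here `g @ mus` reproduces `lam_next`, the weights sum to one, and each weight is dominated by the geometric factor in (3.65c). The geometric decay is what lets the proof discard the early, possibly inaccurate, estimates $\mu_l$.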
4. Rate of convergence

In  this  Section  we  shall  show  under  additional  assumptions  on  the  multiplier  estimate  that 
the  algorithm  converges  at  a  superlinear  rate,  independently  of  the  asymptotic  behavior  of 
the  penalty  parameter. 

Since $p_k \to 0$, we may assume without loss of generality that $p_k$ has been obtained as the minimizer for the QP subproblem, and that the correct active set has been identified. We again start by presenting an outline of the steps taken.

• Bounds on the rate of growth of the penalty parameter introduced in Lemmas 3.9, 3.10 and 3.11 are tightened.

- In Lemma 4.1 we prove that at iterations at which $\rho_k$ is increased (if we have an infinite sequence of such iterations) $\rho_k \|c_k - s_k\| \to 0$ and $\rho_k \|p_k\|^2 \to 0$.

- In Lemma 4.2 and Lemma 4.3 these results are extended to all iterations.

• In Lemma 4.4 it is shown that $\mu_k^T s_k = 0$ for sufficiently large $k$.

• Lemma 4.5 proves the superlinear convergence of the sequence $\{x_k + p_k - x^*\}$, under certain assumptions on $H_k$.

• For $k$ sufficiently large, $\alpha_k = 1$.

- Lemma 4.6 gives the relationship between the descent in one iteration $\phi_k(1) - \phi_k(0)$ and the initial derivative in the linesearch $\phi_k'(0)$.

- Theorem 4.1 shows that $\alpha_k = 1$ for all sufficiently large $k$, implying superlinear convergence.

•  Finally,  Theorem  4.2  shows  that  under  an  additional  condition  on  the  multipliers,  the 
penalty  parameter  remains  bounded. 

The  first  two  Lemmas  introduce  refinements  on  the  results  presented  in  Lemmas  3.9, 
3.10  and  3.11,  and  their  proofs  are  based  on  the  corresponding  proofs  for  these  Lemmas. 

Lemma 4.1. If $k_l \to \infty$, where $k_l$ denotes an iteration at which the penalty parameter is increased, then

\[ \lim_{l\to\infty} \rho_{k_l} \|c_{k_l} - s_{k_l}\| = 0 \quad \text{and} \quad \lim_{l\to\infty} \rho_{k_l} \|p_{k_l}\|^2 = 0. \]

Proof. We drop the subscript $k_l$ in what follows.

Since $p$ is the minimizer of QP, condition (2.5a) holds for a nonnegative vector $\pi$. From (2.4b) and (2.5a) we have $g^T p + p^T H p = -\pi^T c$ and using this result in the definition of $\hat\rho$, (2.13),

\[ \hat\rho \|c - s\|^2 = -\tfrac{1}{2} p^T H p + (2\lambda - \mu - \pi)^T (c - s) - \pi^T s \le \|2\lambda - \mu - \pi\| \|c - s\|. \]

From (2.12) we have $\rho \le 2\hat\rho$, and using Theorem 3.2, MC3 and $\pi_{k_l} \to \lambda^*$ we obtain

\[ \lim_{l\to\infty} \rho_{k_l} \|c_{k_l} - s_{k_l}\| \le 2 \lim_{l\to\infty} \|2\lambda_{k_l} - \mu_{k_l} - \pi_{k_l}\| = 0. \tag{4.1} \]

Finally, from (3.33) and (4.1) we have $\lim_{l\to\infty} \rho_{k_l} \|p_{k_l}\|^2 = 0$, completing the proof. ∎

Lemma 4.2. For general iterations $k$, $\lim_{k\to\infty} \rho_k \|p_k\|^2 = 0$.

Proof. Define $I = k_l$ and $K = k_{l+1}$.

If $\rho$ is bounded, the result follows from Theorem 3.1. If $\rho$ is increased in an infinite number of iterations, from (3.35) and Lemma 3.13 we only need to show that $\phi_I - \phi_K \to 0$.

From the boundedness of $\|\lambda_k\|$ (Lemma 2.1), Lemma 4.1 and the fact that $\rho_I \le \rho_K$, we have

\[ \rho_I |\lambda_I^T (c_I - s_I)| \le \rho_I \|\lambda_I\| \|c_I - s_I\| \to 0, \]

\[ \rho_I |\lambda_K^T (c_K - s_K)| \le \rho_K \|\lambda_K\| \|c_K - s_K\| \to 0. \]

We also have from Lemma 4.1,

\[ \rho_I^2 \|c_I - s_I\|^2 \to 0, \qquad \rho_I^2 \|c_K - s_K\|^2 \to 0. \]

These results and the definition of $\phi$, (2.2), imply

\[ \rho_I (\phi_I - \phi_K) - \rho_I (F_I - F_K) \to 0. \tag{4.2} \]

We now analyze the asymptotic behavior of the term $\rho_I (F_I - F_K)$. We have

\[ F_I - F_K = (c_I - c_K)^T \pi_I + o(\max(\|p_I\|^2, \|p_K\|^2)). \]

Using the same arguments as in the proof of Lemma 3.10, inequality (3.40) also holds in this case, and from (3.14),

\[ \rho_I \pi_I^T c_I \le \rho_I \|c_I - s_I\| \|2\lambda_I - \mu_I\| \le 3\beta_M \rho_I \|c_I - s_I\|. \tag{4.3} \]

A second bound for this term can be obtained from $\pi_I \ge 0$ and $s_I \ge 0$, implying

\[ \rho_I \pi_I^T c_I \ge \rho_I \pi_I^T (c_I - s_I) \ge -\rho_I \|\pi_I\| \|c_I - s_I\|. \tag{4.4} \]

Since $\|\pi_I\|$ is bounded, it follows from applying Lemma 4.1 to (4.3) and (4.4) that

\[ \rho_I \pi_I^T c_I \to 0. \tag{4.5} \]

From (2.9), the boundedness of $\|\pi_I\|$ and Lemma 4.1,

\[ -\rho_I c_K^T \pi_I \le \rho_I (c_K^-)^T \pi_I \le \rho_K \|\pi_I\| \|c_K - s_K\| \to 0. \tag{4.6} \]

We can again use Lemma 4.1 to obtain

\[ \rho_I\, o(\max(\|p_I\|^2, \|p_K\|^2)) \to 0. \tag{4.7} \]

From (3.39), (4.5), (4.6) and (4.7) we have that the sequence $\{\rho_I (F_I - F_K)\}$ is bounded above by a sequence that converges to zero. It then follows from $\phi_I - \phi_K \ge 0$ and (4.2) that $\rho_I (\phi_I - \phi_K) \to 0$ and the desired result follows from (3.35) and Lemma 3.14. ∎


Lemma 4.3. For general iterations $k$, $\lim_{k\to\infty} \rho_k \|c_k - s_k\| = 0$.

Proof. If $\rho$ is bounded the result follows from $c^* \ge 0$, $\lambda^* \ge 0$, $\lambda^{*T} c^* = 0$, Theorems 3.1 and 3.2 and (2.8).

We assume therefore that $\rho$ is increased an infinite number of times. Consider two cases:

• If constraint $j$ is such that $c_j^* > 0$, then $\lambda_j^* = 0$ and from (2.8),

\[ \rho |c_j - s_j| = |\min(\rho c_j, \lambda_j)|, \]

but from Theorem 3.2 and assumption A6, eventually $\lambda_j < \rho c_j$, implying

\[ \rho |c_j - s_j| = |\lambda_j| \to 0. \]

• For those $j$ such that $c_j^* = 0$, implying $\lambda_j^* > 0$, consider iteration indices large enough that the correct active set is identified (Lemma 3.7), implying $a_j^T p + c_j = 0$. From the Taylor series expansion for $c_j$ and the boundedness of the steplength,

\[ c_j(x_k + \alpha_k p_k) = c_j(x_k) + \alpha_k (a_k)_j^T p_k + O(\|\alpha_k p_k\|^2) = (1 - \alpha_k) c_j(x_k) + O(\|p_k\|^2). \]

Recurring this relationship for $k$, $I \le k < K$, we get

\[ \rho_k (c_k)_j = \rho_I (c_k)_j = \rho_I \prod_{l=I}^{k-1} (1 - \alpha_l) (c_I)_j + \rho_I O\Big(\sum_{l=I}^{k-1} \|p_l\|^2\Big), \]

but as $0 < \alpha_l \le 1$ we must have

\[ \rho_k |(c_k)_j| \le \rho_I |(c_I)_j| + \rho_I O\Big(\sum_{l=I}^{k-1} \|p_l\|^2\Big). \tag{4.8} \]

From $c_j^* = 0$, assumption A6 and (2.8), eventually it must hold that $\rho_I |(c_I)_j - (s_I)_j| = \rho_I |(c_I)_j|$, and using Lemma 4.1, (4.8) and Lemma 4.2,

\[ \rho_k |(c_k)_j| \to 0. \]

From this result, definition (2.8), assumption A6 and Theorem 3.2, for $k$ large enough

\[ \rho_k |(c_k)_j - (s_k)_j| = |\min(\rho_k (c_k)_j, (\lambda_k)_j)| = |\rho_k (c_k)_j| \to 0. \]

This completes the proof. ∎

Lemma 4.4. For $k$ large enough $\mu_k^T s_k = 0$.

Proof. If constraint $j$ is such that $c_j^* > 0$, then for $k$ large enough $(c_k)_j \ge \epsilon > 0$, and $(a_k)_j^T p_k + (c_k)_j \ge \tfrac{1}{2}\epsilon > 0$. It therefore follows from MC2 that $(\mu_k)_j = 0$.

If $j$ is such that $c_j^* = 0$, then from assumption A6, $\lambda_j^* > 0$. Also, from Lemma 4.3, $\rho_k ((c_k)_j - (s_k)_j) = \min(\rho_k (c_k)_j, (\lambda_k)_j) \to 0$, and for large enough $k$ Theorem 3.2 will imply $\rho_k (c_k)_j < (\lambda_k)_j$; these two results and definition (2.7) imply

\[ (s_k)_j = \max\Big(0, (c_k)_j - \frac{(\lambda_k)_j}{\rho_k}\Big) = 0, \]

completing the result. ∎

To prove that the algorithm converges superlinearly it is necessary to assume that $H_k$ converges to an approximation of $\nabla^2_{xx} L(x^*, \lambda^*)$ in some sense, where $L(x, \lambda)$ denotes the Lagrangian function for problem NP.

Define $W_k$ as

\[ W_k = \nabla^2_{xx} L(x_k, \lambda_k) = \nabla^2_{xx} F(x_k) - \sum_j (\lambda_k)_j \nabla^2_{xx} c_j(x_k). \tag{4.9} \]

We  impose  the  following  additional  condition  on  Hk: 

HC3. Following Boggs, Tolle and Wang [BTW82], we assume

\[ \|Z_k^T (H_k - W_k) p_k\| = o(\|p_k\|), \]

where $Z_k$ is a basis for the null space of $\hat A_k$, the Jacobian at $x_k$ of those constraints active at $x^*$, that is bounded in norm and has its smallest singular value bounded away from 0.

The proof proceeds by first showing that the sequence $\{x_k + p_k - x^*\}$ converges superlinearly, and then proving that a steplength of one is eventually attained.

The  following  lemma  corresponds  to  Theorem  3.1  in  [BTW82]. 

Lemma 4.5. Under assumptions A1-A7, and conditions MC1-MC3, HC1-HC3,

\[ \|x_k + p_k - x^*\| = o(\|x_k - x^*\|). \tag{4.10} \]

The results presented on bounds for the growth rate of the penalty parameter allow us to obtain an asymptotic expansion for the quantities involved in the linesearch termination criterion. We want to prove that condition (2.14) is satisfied for $k$ sufficiently large. It is shown in the following lemma that the satisfaction of (2.14) is directly related to the asymptotic properties of $T_k = p_k^T (g_k - A_k^T \mu_k) + p_k^T W_k p_k$.

Lemma 4.6. The following relationship holds:

\[ \phi_k(1) - \phi_k(0) = \tfrac{1}{2}\phi_k'(0) + \tfrac{1}{2} T_k + o(\|p_k\|^2). \]

Proof. In the proof we drop the subscript $k$, and we denote quantities associated with $x_k + p_k$ by a tilde, that is, $\tilde F = F(x_k + p_k)$ while $F = F(x_k)$.

From the definition of the merit function (2.2) and (2.1) we have

\[ \phi(1) - \phi(0) = \tilde F - F - \mu^T (\tilde c - s - q) + \lambda^T (c - s) + \tfrac{1}{2}\rho (\tilde c - s - q)^T (\tilde c - s - q) - \tfrac{1}{2}\rho (c - s)^T (c - s). \tag{4.11} \]

From the Taylor series expansion of $c$ around $x$ and (2.10) we have

\[ \tilde c_j - s_j - q_j = \tilde c_j - c_j - a_j^T p = \tfrac{1}{2} p^T \nabla^2 c_j p + o(\|p\|^2), \]

and using this result with the Taylor expansions for $c$ and $F$ in (4.11) we obtain

\[ \phi(1) - \phi(0) = g^T p + \tfrac{1}{2} p^T \nabla^2 F p - \tfrac{1}{2} \sum_j \mu_j p^T \nabla^2 c_j p + \lambda^T (c - s) + \tfrac{1}{8}\rho \sum_j (p^T \nabla^2 c_j p)^2 - \tfrac{1}{2}\rho (c - s)^T (c - s) + o(\|p\|^2). \tag{4.12} \]

From (2.6), condition MC3 and Theorem 3.2 we have

\[ \mu = \lambda + \xi = \lambda + o(1). \tag{4.13} \]

Also, from Lemma 4.2 and assumption A3 we have $\rho\, p^T \nabla^2 c_j p = o(1)$. Replacing these results in (4.12) and reordering the terms we obtain

\[ \phi(1) - \phi(0) = g^T p + \tfrac{1}{2} p^T \nabla^2 F p - \tfrac{1}{2} \sum_j \lambda_j p^T \nabla^2 c_j p + \tfrac{1}{2} (2\lambda - \mu)^T (c - s) + \tfrac{1}{2} \mu^T (c - s) - \tfrac{1}{2}\rho (c - s)^T (c - s) + o(\|p\|^2). \]

Using (4.9) and (3.2) to simplify this expression,

\[ \phi(1) - \phi(0) = \tfrac{1}{2}\phi'(0) + \tfrac{1}{2} \big(g^T p + p^T W p + \mu^T (c - s)\big) + o(\|p\|^2). \tag{4.14} \]

Finally, from condition MC2 we have $\mu^T c = -\mu^T A p$, and from Lemma 4.4 we know that eventually $\mu^T s = 0$, implying in particular that $\mu^T s = o(\|p\|^2)$, and replacing these bounds in (4.14) we have

\[ \phi(1) - \phi(0) = \tfrac{1}{2}\phi'(0) + \tfrac{1}{2} \big(p^T W p + p^T (g - A^T \mu)\big) + o(\|p\|^2), \]

completing the result. ∎

The  main  result  of  this  section  is  given  in  the  next  theorem.  It  is  shown  that,  if  condition 
MC3  is  replaced  by  a  stronger  condition,  then  after  a  finite  number  of  iterations  a  steplength 
of  one  is  taken  for  all  iterations  thereafter,  implying  that  the  algorithm  achieves  superlinear 
convergence.  The  new  condition  is 

MC3'. $\|\mu_k - \lambda^*\| = o(\|x_k - x^*\|)$.

It is possible to prove superlinear convergence without the need to strengthen the conditions on the multipliers. It is shown in [Pr89] that there exists a constant $M$ such that if $\rho_k \ge M$, condition MC3 is sufficient.

Theorem 4.1. If MC3' and all other assumptions and conditions hold then eventually a unit step is always taken and the algorithm converges superlinearly.

Proof. As in Powell and Yuan [PY86], observe that the continuity of second derivatives gives the following relationships:

\[ F(x_k + p_k) = F(x_k) + \tfrac{1}{2} \big(g(x_k) + g(x_k + p_k)\big)^T p_k + o(\|p_k\|^2), \]

\[ c(x_k + p_k) = c(x_k) + \tfrac{1}{2} \big(A(x_k) + A(x_k + p_k)\big) p_k + o(\|p_k\|^2). \tag{4.15} \]
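The first relation in (4.15) is a trapezoidal-rule identity along the step, and its remainder can be observed numerically. The sketch below uses a made-up smooth function $F$ (not from the paper) and shrinks the step along a fixed direction; the remainder divided by $\|p\|^2$ should tend to zero, which is the meaning of the $o(\|p\|^2)$ term.

```python
import numpy as np

def F(x):
    # Hypothetical smooth test function.
    return np.exp(x[0] + 2.0 * x[1]) + x[0] * x[1]

def grad_F(x):
    e = np.exp(x[0] + 2.0 * x[1])
    return np.array([e + x[1], 2.0 * e + x[0]])

def trapezoid_remainder(x, p):
    """F(x + p) - F(x) - 0.5 * (g(x) + g(x + p))^T p."""
    return F(x + p) - F(x) - 0.5 * (grad_F(x) + grad_F(x + p)) @ p

x = np.array([0.3, -0.2])
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit vector, so ||t * direction|| = t
steps = [1e-1, 5e-2, 2.5e-2, 1.25e-2]

# Remainder scaled by ||p||^2 for a sequence of shrinking steps.
ratios = [abs(trapezoid_remainder(x, t * direction)) / t**2 for t in steps]
```

For this example the scaled remainder roughly halves each time the step is halved, consistent with a remainder of order $\|p\|^3$ when third derivatives exist; the paper only needs the weaker $o(\|p\|^2)$ statement.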


From the Taylor series expansions we have

\[ F(x_k + p_k) = F(x_k) + g(x_k)^T p_k + \tfrac{1}{2} p_k^T \nabla^2 F(x_k) p_k + o(\|p_k\|^2), \]

\[ c_j(x_k + p_k) = c_j(x_k) + a_j(x_k)^T p_k + \tfrac{1}{2} p_k^T \nabla^2 c_j(x_k) p_k + o(\|p_k\|^2), \tag{4.16} \]

and since (4.10) and Lemma 3.8 imply $g(x_k + p_k) = g^* + o(\|p_k\|)$, $a_j(x_k + p_k) = a_j^* + o(\|p_k\|)$, we get from (4.15) and (4.16) that (we drop the subscript $k$)

\[ p^T \nabla^2 F p = (g^* - g)^T p + o(\|p\|^2), \tag{4.17a} \]

\[ p^T \nabla^2 c_j p = (a_j^* - a_j)^T p + o(\|p\|^2). \tag{4.17b} \]

Condition MC3, Theorem 3.2 and (4.13) give $\sum_j \lambda_j p^T \nabla^2 c_j p = \sum_j \mu_j p^T \nabla^2 c_j p + o(\|p\|^2)$, and if we apply this bound to the result of adding (4.17a) to (4.17b) multiplied by $\lambda_j$, we have

\[ p^T W p = p^T (g^* - A^{*T} \mu) - p^T (g - A^T \mu) + o(\|p\|^2). \tag{4.18} \]

Condition MC3', (1.1) and Lemma 3.8 imply

\[ p^T (g^* - A^{*T} \mu) = p^T A^{*T} (\lambda^* - \mu) = o(\|p\|^2), \]

and from (4.18),

\[ T = p^T W p + p^T (g - A^T \mu) = p^T (g^* - A^{*T} \mu) + o(\|p\|^2) = o(\|p\|^2). \tag{4.19} \]

From Lemma 4.6 and (4.19) we get

\[ \phi(1) - \phi(0) = \tfrac{1}{2}\phi'(0) + o(\|p\|^2). \]

Since $\phi'(0) < 0$, the above relationship and Theorem 3.1 imply that condition (2.14) is eventually satisfied for $k$ sufficiently large.

Regarding condition (2.16), we can use Taylor series expansions for $c_j$ to write

\[ c_j(x_k + p_k) = c_j(x_k) + a_j(x_k + \theta_j p_k)^T p_k \tag{4.20} \]

for some $\theta_j \in [0, 1]$, and

\[ a_j(x_k + \theta_j p_k)^T p_k = a_j(x_k)^T p_k + \theta_j p_k^T \nabla^2 c_j(x_k + \bar\theta_j p_k) p_k, \tag{4.21} \]

for $\bar\theta_j \in [0, \theta_j]$.

Using Theorem 3.1 and the boundedness of $\|\nabla^2 c_j(x_k + \bar\theta_j p_k)\|$ (from Assumption A3 and Lemma 3.4) in (4.21), for $k$ large enough

\[ a_j(x_k + \theta_j p_k)^T p_k \ge a_j(x_k)^T p_k - \tfrac{1}{2}\beta_c, \]

and from (2.4b),

\[ a_j(x_k + \theta_j p_k)^T p_k \ge a_j(x_k)^T p_k - \tfrac{1}{2}\beta_c \ge -c_j(x_k) - \tfrac{1}{2}\beta_c. \]

Replacing this bound in (4.20), we obtain for all $k$ large enough $c(x_k + p_k) \ge -\tfrac{1}{2}\beta_c$, and condition (2.16) will also be satisfied, giving $x_{k+1} = x_k + p_k$. The required result then follows from Lemma 4.5. ∎

Boundedness of the penalty parameter

The  last  result  in  this  section  shows  that,  if  condition  MC3’  is  replaced  by  a  slightly 
stronger  condition,  the  penalty  parameter  needs  to  be  modified  in  at  most  a  finite  number  of 
iterations  (and  consequently  it  remains  bounded).  The  criterion  presented  will  be  satisfied, 
for example, by the least-squares multipliers computed at $x_k + p_k$.

Theorem 4.2. If the multiplier estimates $\mu_k$ in the algorithm satisfy

\[ \|\mu_k - \lambda^*\| = O(\|x_k + p_k - x^*\|), \tag{4.22} \]

and all other assumptions and conditions hold then there exists a constant $M$ such that $\rho_k \le M$ for all $k$.

Proof. We may assume $k$ large enough so that $\alpha_{k-1} = 1$. From (2.5), (2.4b) and $\pi_k^T s_k \ge 0$, we have

\[ g_k^T p_k + p_k^T H_k p_k = p_k^T A_k^T \pi_k = -c_k^T \pi_k \le -(c_k - s_k)^T \pi_k, \tag{4.23} \]

where $\pi_k$ denotes the QP multipliers at iteration $k$. From (3.2), (4.23) and the fact that a unit steplength is accepted, it follows that

\[ \phi_k'(0) \le -p_k^T H_k p_k + \|2\mu_{k-1} - \mu_k - \pi_k\| \|c_k - s_k\| - \rho_k \|c_k - s_k\|^2. \tag{4.24} \]

From (4.22), HC2, Lemma 3.8 and $\|\pi_k - \lambda^*\| = O(\|p_k\|)$ we must have

\[ \|2\mu_{k-1} - \mu_k - \pi_k\| \le M_1 \|p_k\| \le M_2 \sqrt{p_k^T H_k p_k} \]

for some positive constants $M_1$, $M_2$. It then follows using $a^2 + b^2 \ge 2ab$ that

\[ \|2\mu_{k-1} - \mu_k - \pi_k\| \|c_k - s_k\| \le M_2 \sqrt{p_k^T H_k p_k}\, \|c_k - s_k\| \le \tfrac{1}{2} p_k^T H_k p_k + \tfrac{1}{2} M_2^2 \|c_k - s_k\|^2, \]

implying from (4.24) that

\[ \phi_k'(0) \le -\tfrac{1}{2} p_k^T H_k p_k + (\tfrac{1}{2} M_2^2 - \rho_k) \|c_k - s_k\|^2. \]

From this inequality it follows that if $\rho_k \ge \tfrac{1}{2} M_2^2$, condition (2.11) will be satisfied, and the penalty parameter will not be increased. Given that we are using the rule (2.12) for updating $\rho_k$, it must hold that $\rho_k \le M_2^2$. ∎
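The two algebraic steps at the end of this proof can be written out explicitly. The derivation below is a sketch under one assumption that is not visible in this part of the paper: that condition (2.11) is the test $\phi_k'(0) \le -\tfrac{1}{2} p_k^T H_k p_k$. The symbols $a$ and $b$ are introduced here only for the illustration.

```latex
% Set a = \sqrt{p_k^T H_k p_k} and b = M_2 \|c_k - s_k\|; then a^2 + b^2 \ge 2ab gives
\[
M_2 \sqrt{p_k^T H_k p_k}\,\|c_k - s_k\| = ab
  \le \tfrac12 a^2 + \tfrac12 b^2
  = \tfrac12 p_k^T H_k p_k + \tfrac12 M_2^2 \|c_k - s_k\|^2 .
\]
% Substituting this bound for the middle term of (4.24):
\[
\phi_k'(0) \le -p_k^T H_k p_k + \tfrac12 p_k^T H_k p_k
  + \tfrac12 M_2^2 \|c_k - s_k\|^2 - \rho_k \|c_k - s_k\|^2
  = -\tfrac12 p_k^T H_k p_k + \bigl(\tfrac12 M_2^2 - \rho_k\bigr)\|c_k - s_k\|^2 .
\]
% If \rho_k \ge \tfrac12 M_2^2 the last term is nonpositive, so
\[
\phi_k'(0) \le -\tfrac12 p_k^T H_k p_k .
\]
```

Under that reading, the penalty parameter stops increasing once it exceeds $\tfrac{1}{2} M_2^2$.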

5.  Other  Merit  Functions 

Several merit functions have been proposed and analyzed in the literature (a review can be found in Powell [Po87]). The question arises whether the convergence results using early termination in the solution of the QP subproblem depend on our specific merit function, or whether they are fairly independent of this choice. We shall show in this section that the choice of merit function is not critical. What we present is how to adapt our SQP algorithm to the use of other merit functions, rather than examining other methods explicitly to see whether the particular QP subproblem posed and the manner in which the search is performed can be adapted to the use of an incomplete solution. For example, we still perform a search in the $x$ and $\lambda$ spaces. Slack variables do not appear in the merit functions we shall consider; consequently the search in the space of the slack variables is no longer required.

We have selected as examples the study of two particular merit functions. The first one corresponds to a class of merit functions that includes among others the $\ell_1$ merit function analyzed in Han [Han76], Byrd and Nocedal [BN88] and Burke and Han [BH89]. This general merit function takes the form:

\[ \phi(x, \lambda) = F(x) + \lambda^T c^-(x) + \rho \|c^-(x)\|_p, \tag{5.1} \]

where an $\ell_p$ norm ($1 \le p \le \infty$) is used, and $c_j^-(x) = \max(0, -c_j(x))$. The second merit function we consider is

\[ \phi(x, \lambda) = F(x) + \lambda^T c^-(x) + \tfrac{1}{2}\rho \|c^-(x)\|_2^2. \tag{5.2} \]

This merit function has been studied among others by Powell and Yuan [PY86] (applied to the equality-constrained problems only) and Schittkowski [Sch81]. Unlike either of these algorithms, where the multiplier estimate $\lambda$ was treated as a function of the iterate, $\lambda(x)$, we do not explicitly define the form of the multiplier estimates, although the ones used in both methods satisfy the criteria MC1, MC2 and MC3. Indeed the one used in [PY86] also satisfies MC3'.
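The two merit functions are straightforward to evaluate once $F(x)$ and $c(x)$ are available. The following sketch assumes constraints of the form $c(x) \ge 0$, so that $c_j^- = \max(0, -c_j)$ measures the violation; the function names are illustrative and the inputs are plain numbers rather than an actual problem.

```python
import numpy as np

def c_minus(c):
    """Constraint violations c_j^- = max(0, -c_j)."""
    return np.maximum(0.0, -np.asarray(c, dtype=float))

def merit_lp(F, c, lam, rho, p=1):
    """Merit function (5.1): F + lam^T c^- + rho * ||c^-||_p."""
    cm = c_minus(c)
    return F + np.dot(lam, cm) + rho * np.linalg.norm(cm, ord=p)

def merit_l2_squared(F, c, lam, rho):
    """Merit function (5.2): F + lam^T c^- + 0.5 * rho * ||c^-||_2^2."""
    cm = c_minus(c)
    return F + np.dot(lam, cm) + 0.5 * rho * np.dot(cm, cm)

# Small worked example: the first constraint is satisfied, the other two are violated.
F_val = 3.0
c_val = np.array([1.0, -2.0, -0.5])
lam = np.array([0.0, 1.0, 2.0])
rho = 4.0

m1 = merit_lp(F_val, c_val, lam, rho, p=1)
minf = merit_lp(F_val, c_val, lam, rho, p=np.inf)
m2 = merit_l2_squared(F_val, c_val, lam, rho)
m1_zero_multipliers = merit_lp(F_val, c_val, np.zeros(3), rho, p=1)
```

With zero multipliers and $p = 1$ the first function becomes the classical $\ell_1$ penalty $F + \rho \sum_j c_j^-$. Neither function is differentiable in the usual sense where a constraint changes sign, which is why a different steplength criterion is needed.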

We still assume A1-A7 hold for the problem. However, when the merit function (5.1)
is used, the multiplier estimate $\mu_k$ is only required to satisfy MC1. This condition is trivial
to satisfy. For example, we may choose $\lambda_0 = 0$ and $\mu_k = 0$, making the search in the
multiplier space void. Such a choice reduces (5.1) to the well-known $\ell_1$ merit function and
our algorithm becomes very similar to that analyzed in [Han76]. When (5.2) is used, we
assume conditions MC1 and MC2 hold. We have also assumed in the proofs that $\lambda_0 \ge 0$
and $\mu_k \ge 0$. We omit the proofs that the iterates lie on a compact set. For the first merit
function (5.1) this proof is relatively straightforward, since it will be shown that the penalty
parameter is bounded. The proof for the second merit function (5.2) is very similar to that
for the augmented Lagrangian merit function.

The criteria (2.15) for the choice of steplength assume the merit function has continuous
first derivatives. This property does not necessarily hold for the merit functions under
consideration. Therefore we use the following criteria for determining a value $\alpha_k$.

Define

$$\Delta_k = g_k^T p_k + (\xi_k - \lambda_k)^T c^-(x_k) - \rho_k \|c^-(x_k)\|. \tag{5.3}$$

We start by selecting a value $\hat\alpha_k$ satisfying

$$\phi_k(\hat\alpha_k) = \phi(x_k + \hat\alpha_k p_k, \lambda_k + \hat\alpha_k \xi_k) \le \phi_k(0) + \eta \hat\alpha_k \Delta_k \tag{5.4}$$

and either

$$\hat\alpha_k \ge \gamma_l > 0 \tag{5.5}$$

or

$$\hat\alpha_k \ge \gamma_u \bar\alpha_k \quad \text{and} \quad \phi_k(\bar\alpha_k) > \phi_k(0) + \sigma \bar\alpha_k \Delta_k, \tag{5.6}$$

where $0 < \gamma_l < \gamma_u < 1$, $0 < \eta < \sigma < 1$ and $\bar\alpha_k > 0$. For a discussion of these criteria and
the existence of $\hat\alpha_k$ see Calamai and Moré [CM87].

In addition to these conditions, we also want to limit the size of the infeasibilities. If
$\hat\alpha_k$ satisfies condition (2.16), then we let $\alpha_k = \hat\alpha_k$. Otherwise, we compute $\alpha_k$ by performing
a backtracking linesearch from $\hat\alpha_k$ until conditions (5.4) and (2.16) are both satisfied.
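The backtracking stage described above can be sketched in Python as follows. This is a minimal sketch and not the linesearch of any particular code: the callable `within_limit` is a hypothetical stand-in for condition (2.16), which is not restated in this section, and the default value of `eta`, the halving factor and the iteration cap are assumptions chosen for the example.

```python
from typing import Callable


def backtrack(phi: Callable[[float], float],
              delta: float,
              alpha_hat: float,
              within_limit: Callable[[float], bool],
              eta: float = 1e-4,
              shrink: float = 0.5,
              max_reductions: int = 60) -> float:
    """Reduce the step from alpha_hat until the sufficient-decrease test (5.4)
    and the infeasibility limit (represented by within_limit) both hold.

    phi(alpha) is the merit function along the search direction and delta is
    the quantity Delta_k of (5.3), assumed negative.
    """
    phi0 = phi(0.0)
    alpha = alpha_hat
    for _ in range(max_reductions):
        decrease_ok = phi(alpha) <= phi0 + eta * alpha * delta
        if decrease_ok and within_limit(alpha):
            return alpha
        alpha *= shrink  # assumed reduction rule: simple halving
    raise RuntimeError("no acceptable steplength found")
```

If the trial step already satisfies both tests it is returned unchanged, which corresponds to taking the final step equal to the trial step.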

Our preference for the criteria given in Section 2 is based on our belief that in practice
they lead to a better choice of $\alpha_k$. In the definition of our algorithm we could have used
other steplength criteria without impacting the convergence properties.

The following basic relationships will be used to establish the convergence results:

$$c_j^-(x + \alpha p) \le |c_j(x + \alpha p) - c_j(x) - \alpha a_j^T p| - \min(0, c_j(x) + \alpha a_j^T p) \tag{5.7a}$$

$$-\min(0, c_j(x) + \alpha a_j^T p) \le (1 - \alpha) c_j^-(x) \tag{5.7b}$$

$$-\omega^T A p \le -\|c^-(x)\|, \tag{5.7c}$$

$$-\Omega A p \le -c^-(x). \tag{5.7d}$$

In these inequalities $A = \nabla c(x)$. Also, $\Omega$ is a diagonal matrix such that $-\Omega A p$ is an element
of the subdifferential of $c^-(x + \alpha p)$ at $\alpha = 0$. The diagonal entries of $\Omega$ take values in $[0, 1]$,
are zero whenever $c_j(x) > 0$ and take the value one whenever $c_j(x) < 0$. Finally, $-\omega^T A p$
represents an element of $\partial\psi(0)$, the subdifferential of $\psi(\alpha) = \|c^-(x + \alpha p)\|_p$ at 0. The
elements of $\omega$ are given by

$$\omega_j = \left( \frac{c_j^-(x)}{\|c^-(x)\|_p} \right)^{p-1},$$

and have the property that $\omega^T c(x) = -\|c^-(x)\|_p$.
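The stated property of $\omega$ is easy to check numerically. The Python sketch below assumes that $\omega$ is the gradient of the $\ell_p$ norm evaluated at $c^-(x)$, that is $\omega_j = (c_j^-/\|c^-\|_p)^{p-1}$ on violated constraints and zero elsewhere; the function name and the handling of the case with no violation are choices made for this example.

```python
from typing import List, Sequence


def omega(c_vals: Sequence[float], p: float) -> List[float]:
    """Assumed form of omega: gradient of the l_p norm (1 <= p < infinity)
    evaluated at c^-(x); entries of non-violated constraints are set to zero."""
    cm = [max(0.0, -cj) for cj in c_vals]
    norm = sum(t ** p for t in cm) ** (1.0 / p)
    if norm == 0.0:
        return [0.0] * len(cm)  # no violated constraint (convention assumed here)
    return [(t / norm) ** (p - 1.0) if t > 0.0 else 0.0 for t in cm]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))
```

For $c(x) = (3, -3, -4)$ and $p = 2$ this gives $\omega = (0, 0.6, 0.8)$ and $\omega^T c(x) = -5 = -\|c^-(x)\|_2$, in agreement with the identity quoted above.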

Consider now the case when $\phi$ has been defined from (5.1). From our assumption that
$\lambda_k \ge 0$ and (2.4b),

$$\lambda_k^T \Omega_k (A_k p_k + c_k) \ge 0$$

for all $k$. It follows from this inequality and the relationships given in (5.7) that

$$\phi_k'(0) = g_k^T p_k + \xi_k^T c^-(x_k) - \lambda_k^T \Omega_k A_k p_k - \rho_k \omega_k^T A_k p_k \le \Delta_k.$$

We select $\rho_k$ such that

$$\Delta_k \le -\tfrac{1}{2} p_k^T H_k p_k. \tag{5.8}$$

This rule is analogous to the ones used in Byrd and Nocedal [BN88], and Burke and Han
[BH89].
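A possible way of enforcing (5.8) is sketched below in Python. The smallest admissible value follows from solving (5.8) for $\rho$ using the definition (5.3) of $\Delta_k$; the fixed increment `bump` is an assumption made here so that every increase is by at least a finite amount, and the argument names are hypothetical. The scalars $g^T p$, $(\xi - \lambda)^T c^-$, $p^T H p$ and $\|c^-\|$ are assumed to have been computed beforehand.

```python
def update_penalty(rho: float, gTp: float, mult_term: float,
                   pHp: float, c_norm: float, bump: float = 1.0) -> float:
    """Return a penalty parameter for which Delta <= -0.5 * p^T H p, where
    Delta = gTp + mult_term - rho * c_norm and mult_term = (xi - lam)^T c^-."""
    delta = gTp + mult_term - rho * c_norm
    if delta <= -0.5 * pHp:
        return rho  # current value already satisfies (5.8)
    if c_norm == 0.0:
        # Delta does not depend on rho when there is no infeasibility.
        raise ValueError("(5.8) cannot be enforced through rho when c^- = 0")
    rho_min = (gTp + mult_term + 0.5 * pHp) / c_norm  # smallest admissible rho
    return max(rho_min, rho + bump)
```

With $\rho = 1$, $g^T p = -1$, $(\xi - \lambda)^T c^- = 2$, $p^T H p = 4$ and $\|c^-\| = 2$ the test fails ($\Delta = -1 > -2$) and the sketch returns $\rho = 2$, for which $\Delta = -3$.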

The first step is to establish that such a value of $\rho$ exists. From (3.13) and (5.3) we have

$$\Delta_k \le -(\tfrac{1}{2} + \beta_1) p_k^T H_k p_k + \beta_2 \|c_k^-\| + (\xi_k - \lambda_k)^T c_k^- - \rho \|c_k^-\|. \tag{5.9}$$

If we now use (2.6), property MC1 and Lemma 2.1 to bound the multiplier term

$$(\xi_k - \lambda_k)^T c_k^- \le \|\mu_k - 2\lambda_k\| \|c_k^-\| \le 3\beta_\mu \|c_k^-\|,$$

we obtain in (5.9)

$$\Delta_k \le -(\tfrac{1}{2} + \beta_1) p_k^T H_k p_k + (\beta_2 + 3\beta_\mu - \rho) \|c_k^-\|.$$

Defining $\rho_u = \beta_2 + 3\beta_\mu$, for any value $\rho \ge \rho_u$ condition (5.8) is satisfied for any $k$. This
result also shows that the value of $\rho$ will remain bounded in the algorithm.

Theorem 5.1. The algorithm modified to use the merit function (5.1) converges globally.

Proof. Given the bound in Lemma 3.8, it suffices to show that $\|p_k\| \to 0$.

As $\rho$ cannot grow without bound, any strategy for increasing $\rho$ by a finite quantity
whenever it is required to increase $\rho$ implies that there exists an iteration $K$ such that
$\rho_k = \rho_K$ for all $k \ge K$. We consider only iterations of this form. For $k \ge K$, from (5.4),
(5.8) and condition MC2,

$$\phi(\alpha_k) - \phi(\alpha_{k-1}) \le \alpha_k \eta \Delta_k \le -\tfrac{1}{2} \eta \beta_{svH} \alpha_k \|p_k\|^2.$$

From the boundedness of $\phi$ (assumption A3), it follows that

$$\alpha_k \|p_k\|^2 \to 0. \tag{5.10}$$

If $\|p_k\| \to 0$, convergence follows from Lemma 3.8. Otherwise, if for a subsequence
$\|p_k\| \ge \epsilon$, from (5.10) we must have $\alpha_k \to 0$ along the subsequence, and from the termination
conditions for the linesearch (5.4), (5.5) and (5.6), $\hat\alpha_k \to 0$, as the step required to satisfy
condition (2.16) is uniformly bounded away from zero (see (3.56) and (3.57)). Finally, from
(5.6) we must also have $\bar\alpha_k \to 0$.

In the following relationships we drop the subscript $k$ corresponding to the iteration
number, and we denote by a tilde the value of functions evaluated at $x + \bar\alpha p$ (i.e.,
$\tilde c = c(x + \bar\alpha p)$).

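From the definition of the merit function (5.1),

$$\begin{aligned}
\phi(\bar\alpha) - \phi(0) ={}& \bar\alpha g^T p + \lambda^T (\tilde c^- - c^-) + \bar\alpha \xi^T \tilde c^- - \bar\alpha \rho \|c^-\| \\
& + (\tilde F - F - \bar\alpha g^T p) + \rho \left( \|\tilde c^-\| - (1 - \bar\alpha) \|c^-\| \right).
\end{aligned}$$

For the last term, from (5.7a) and (5.7b), it follows that

$$\|\tilde c^-\| - (1 - \bar\alpha) \|c^-\| \le \|\tilde c - c - \bar\alpha A p\|,$$

and

$$\begin{aligned}
\phi(\bar\alpha) - \phi(0) \le{}& \bar\alpha g^T p + \lambda^T (\tilde c^- - c^-) + \bar\alpha \xi^T \tilde c^- - \bar\alpha \rho \|c^-\| \\
& + (\tilde F - F - \bar\alpha g^T p) + \rho \|\tilde c - c - \bar\alpha A p\|.
\end{aligned}$$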
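The norm bound used for the last term can be spelled out componentwise. The following is a sketch of the intermediate step; it assumes that the steplength lies in $[0, 1]$, so that the factor $(1 - \bar\alpha)$ is nonnegative, and that the norm is monotone on nonnegative vectors (which holds for every $\ell_p$ norm). Neither assumption is stated explicitly at this point of the text.

```latex
\begin{align*}
0 \le \tilde c_j^- &\le |\tilde c_j - c_j - \bar\alpha a_j^T p| + (1 - \bar\alpha)\, c_j^-
  && \text{by (5.7a) and (5.7b)}, \\
\|\tilde c^-\| &\le \|\tilde c - c - \bar\alpha A p\| + (1 - \bar\alpha)\, \|c^-\|
  && \text{by monotonicity and the triangle inequality}.
\end{align*}
```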

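If we use again (5.7a) and (5.7b) on the terms associated with the multiplier estimates
(given that by assumption $\lambda + \bar\alpha \xi \ge 0$), and the Taylor series expansions for $F$ and $c$, we
obtain

$$\begin{aligned}
\phi(\bar\alpha) - \phi(0) \le{}& \bar\alpha g^T p + \sum_j (\lambda_j + \bar\alpha \xi_j) |\tilde c_j - c_j - \bar\alpha a_j^T p| + (1 - \bar\alpha) \lambda^T c^- \\
& - \lambda^T c^- + \bar\alpha (1 - \bar\alpha) \xi^T c^- - \bar\alpha \rho \|c^-\| + O(\|\bar\alpha p\|^2).
\end{aligned}$$

After simplifying this expression we have

$$\phi(\bar\alpha) - \phi(0) \le \bar\alpha \left( g^T p + (\xi - \lambda)^T c^- - \rho \|c^-\| \right) + \bar\alpha^2 \|c^-\| \|\xi\| + O(\|\bar\alpha p\|^2). \tag{5.11}$$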


Replacing this bound in (5.6) implies

$$0 \le (1 - \sigma) \bar\alpha \Delta + \bar\alpha^2 \|c^-\| \|\xi\| + O(\|\bar\alpha p\|^2). \tag{5.12}$$

Since from (5.8) and condition HC2, $\Delta \le -\tfrac{1}{2} \beta_{svH} \|p\|^2$, and we have assumed that $\|p\| \ge \epsilon$,
it follows by taking limits along the subsequence that

$$0 \le -(1 - \sigma) \tfrac{1}{2} \beta_{svH} \epsilon^2.$$

However, this is not possible, implying $\|p_k\| \to 0$ for the whole sequence. ∎

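Consider now the second merit function (5.2). The subgradient along the search direction
at $(x_k, \lambda_k)$ is given by

$$\phi_k'(0) = g_k^T p_k + \xi_k^T c^-(x_k) - \lambda_k^T \Omega_k A_k p_k - \rho_k c^-(x_k)^T A_k p_k \le \Delta_k,$$

where

$$\Delta_k \equiv g_k^T p_k + (\xi_k - \lambda_k)^T c^-(x_k) - \rho_k \|c^-(x_k)\|^2.$$

Note that $\lambda_k \ge 0$ implies

$$(\Omega_k \lambda_k + \rho_k c_k^-)^T (A_k p_k + c_k) \ge 0.$$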

In this case it is not immediately evident that $\rho_k$ remains bounded. The convergence
proof we give is similar to the one introduced in Section 3. The definition of $\rho$ given in that
section will be preserved, except that $c - s$ is replaced by $c^-$.

Theorem  5.2.  The  algorithm  modified  to  use  the  merit  function  (5.2)  converges  globally. 

Proof. Again, from Lemma 3.8 it is enough to show that $\|p_k\| \to 0$.

First assume that $\rho$ is bounded. The argument used is similar to the one in Theorem 5.1.
From (5.4), (5.8), condition MC2 and the boundedness of $\phi$, (5.10) must hold also for this
case.

If $\|p_k\| \to 0$, convergence follows from Lemma 3.8. Otherwise, if for a subsequence
$\|p_k\| \ge \epsilon$, from (5.10) we must have $\alpha_k \to 0$, and from condition (5.6) and the boundedness
of the step to satisfy (2.16), $\bar\alpha_k \to 0$.

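From (5.2), (5.7a) and (5.7b), we also have (we again drop the index $k$ in the following
relationships, and use a tilde to indicate values at $x + \bar\alpha p$)

$$\begin{aligned}
\phi(\bar\alpha) - \phi(0) \le{}& \bar\alpha g^T p + \lambda^T (\tilde c^- - c^-) + \bar\alpha \xi^T \tilde c^- - \rho (\bar\alpha - \tfrac{1}{2} \bar\alpha^2) \|c^-\|^2 \\
& + \rho \|\tilde c - c - \bar\alpha A p\| \left( \tfrac{1}{2} \|\tilde c - c - \bar\alpha A p\| + \|(c + \bar\alpha A p)^-\| \right) \\
& + (\tilde F - F - \bar\alpha g^T p),
\end{aligned}$$

and again using (5.7a) and (5.7b) on the terms associated with the multiplier estimates, we
obtain

$$\begin{aligned}
\phi(\bar\alpha) - \phi(0) \le{}& \bar\alpha \left( g^T p + (\xi - \lambda)^T c^- - \rho \|c^-\|^2 \right) \\
& + \bar\alpha^2 \|c^-\| \left( \|\xi\| + \tfrac{1}{2} \rho \|c^-\| \right) + O(\|\bar\alpha p\|^2).
\end{aligned} \tag{5.13}$$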


Replacing this bound in (5.6) implies

$$0 \le (1 - \sigma) \bar\alpha \Delta + \bar\alpha^2 \|c^-\| \left( \|\xi\| + \tfrac{1}{2} \rho \|c^-\| \right) + O(\|\bar\alpha p\|^2). \tag{5.14}$$

Since from (5.8) and condition HC2, $\Delta \le -\tfrac{1}{2} \beta_{svH} \|p\|^2$, and we have assumed that $\|p\| \ge \epsilon$
and $\rho$ is bounded, it follows by taking limits along the subsequence that

$$0 \le -(1 - \sigma) \tfrac{1}{2} \beta_{svH} \epsilon^2.$$

However, this is not possible, which implies $\|p_k\| \to 0$ for the whole sequence.

Assume now that $\rho_k$ grows without bound. In this case we have that for all iterations
where the value of the penalty parameter is increased

$$\rho_k \|c_k^-\| \le \kappa_1 \quad \text{and} \quad \rho_k \|p_k\|^2 \le \kappa_2.$$

The proof of this result is basically that of Lemma 3.9. From these bounds it is possible to
show that we must also have

$$\rho_k \|p_k\|^2 \le \kappa$$

for all $k$ (the proof is similar to the one for Lemma 3.10), implying $p_k \to 0$ and the convergence
of the algorithm. ∎

6.  Numerical  Results 

In  this  section  we  present  numerical  results  obtained  from  an  implementation  of  our  algo¬ 
rithm.  As  a  first  step  we  have  modified  the  code  NPSOL.  We  have  called  the  modified 
routine  INPSOL.  Apart  from  the  definition  of  the  search  direction  all  other  aspects  of 
INPSOL  are  identical  to  those  of  NPSOL.  A  detailed  description  of  NPSOL  is  given  in  Gill 
et al. [GMSW86a]. It should be noted that NPSOL does not incorporate linear constraints
into  the  merit  function.  An  initial  point  is  obtained  that  is  feasible  with  respect  to  the  linear 
constraints  and  thereafter  feasibility  is  retained  (by  incorporating  the  linear  constraints  in 
the  QP  subproblem).  On  many  practical  problems  the  feasible  region  with  respect  to  the 
linear  constraints  is  compact.  On  such  problems  this  approach  ensures  assumption  A2  is 
satisfied, and assumption A1 is implied by A3.

The  purpose  of  the  testing  reported  is  to  demonstrate  that  the  efficiency  and  robustness 
of  the  modified  algorithm  are  comparable  to  those  of  NPSOL.  Naturally,  we  can  only  test  the 
hypothesis  on  the  domain  of  problems  NPSOL  is  designed  to  solve,  namely  problems  having 
a  small  number  of  variables  and  constraints,  although  on  these  problems  the  opportunities 
for  improvement  are  limited,  as  we  discuss  later.  What  this  implementation  really  tests 
is  whether  the  introduction  of  flexibility  in  the  determination  of  the  search  direction  has 
a significant cost. The parameter $\beta_c$ was set to infinity to avoid differences with NPSOL
arising entirely from the linesearch.

The  search  direction 

The algorithm described in Section 2 allows for considerable flexibility of design. We describe
here the specific choices made in our implementation. The search direction $p_k$ is
computed according to the following steps. (The subscript $k$ is dropped from now on.)

•  An initial feasible point for each QP subproblem, $p_0$, is obtained following the same
procedure as NPSOL. No special effort was made to satisfy conditions (2.18) since on
the problems tested no failure was detected that could be attributed to the size of
$\|p_0\|$.

•  The active-set method used in NPSOL was terminated at $\bar p$, the first stationary point.
The multipliers $\pi$ at $\bar p$ are then computed. Define $\bar\pi$ as $\bar\pi_j = \pi_j \|a_j\|$.

•  Let $\epsilon_M$ denote machine precision. If

$$\forall j, \quad \bar\pi_j \ge \ldots \tag{6.1}$$

then $\bar p$ is taken as the search direction.


•  If (6.1) does not hold, a step that moves off a subset of the active constraints is
computed. To identify the set of active constraints to be deleted, define
$\bar\pi_{\min} = \min_j \bar\pi_j$, and introduce a vector $e_I$ as

$$(e_I)_j = \begin{cases} \|a_j\| & \text{if } \bar\pi_j \le 10^{-3} \bar\pi_{\min}, \\ 0 & \text{otherwise.} \end{cases} \tag{6.2}$$


•  There  is  also  a  limit  of  50  on  the  maximum  number  of  constraints  to  be  deleted.  If 
(6.2)  is  satisfied  by  more  than  50  active  constraints,  only  the  ones  having  the  smallest 
multipliers  are  deleted.  For  most  problems  this  limit  has  no  effect,  since  the  total 
number  of  constraints  is  less  than  50. 

•  The direction $d$ that moves off the selected constraints is obtained as the least-length
solution of the system $A d = e_I$; that is, we define

$$d = Y (A Y)^{-1} e_I,$$

where $Y$ denotes a basis for the range-space of $A^T$.

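•  We obtain the search direction $p$ from (2.21), as

$$p = \begin{cases} \bar p + \gamma d & \text{if } \|\bar p\| \le \beta_{lp} \|\bar p + \gamma d\|, \\ \bar p & \text{otherwise,} \end{cases}$$

where $\gamma$ was defined as in (2.24) with $\gamma_u = 10^{10}$ and $\beta_{lp} = 100$ (with this value the
step $\bar p + \gamma d$ is accepted in nearly all cases).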

•  Finally, the multiplier estimate used to define the linesearch is taken to be $\pi$ if $p = \bar p$.
Otherwise, it is taken to be the least-squares estimate obtained from

$$A A^T \mu_L = A g.$$
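The linear-algebra steps in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the INPSOL implementation: the function names are hypothetical, dense lists stand in for the factorizations used in practice, the rows of $A$ are assumed linearly independent, the least-length direction is computed with the particular choice $Y = A^T$, and the selection rule is assumed to be applied only when the smallest scaled multiplier is negative.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def deletion_vector(pi_bar: Sequence[float], a_norms: Sequence[float],
                    frac: float = 1e-3, max_deleted: int = 50) -> List[float]:
    """Vector of (6.2): ||a_j|| for constraints selected for deletion, else 0.
    At most max_deleted constraints are kept, those with smallest multipliers."""
    pi_min = min(pi_bar)
    marked = [j for j, v in enumerate(pi_bar) if v <= frac * pi_min]
    chosen = set(sorted(marked, key=lambda j: pi_bar[j])[:max_deleted])
    return [a_norms[j] if j in chosen else 0.0 for j in range(len(pi_bar))]


def solve(M: Matrix, b: Sequence[float]) -> List[float]:
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(M, b)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(aug[i][k]))
        if aug[piv][k] == 0.0:
            raise ValueError("matrix is singular")
        aug[k], aug[piv] = aug[piv], aug[k]
        for i in range(k + 1, n):
            f = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= f * aug[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def gram(A: Matrix) -> Matrix:
    """Form A A^T."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]


def least_length_direction(A: Matrix, e: Sequence[float]) -> List[float]:
    """Least-length solution of A d = e, computed as d = A^T (A A^T)^{-1} e."""
    y = solve(gram(A), e)
    return [sum(A[i][k] * y[i] for i in range(len(A))) for k in range(len(A[0]))]


def least_squares_multipliers(A: Matrix, g: Sequence[float]) -> List[float]:
    """Solve A A^T mu = A g for the least-squares multiplier estimate."""
    rhs = [sum(a * gi for a, gi in zip(row, g)) for row in A]
    return solve(gram(A), rhs)
```

For two active constraints with scaled multipliers $(-4, 0.5)$, only the first is selected, and the resulting direction satisfies $a_1^T d > 0$ and $a_2^T d = 0$: it moves off the first constraint while remaining on the second.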

Test problems

The  two  algorithms,  NPSOL  and  INPSOL,  have  been  compared  by  solving  a  collection  of 
114  problems  from  the  literature.  The  problems  have  been  obtained  from  the  following 
sources: 

•  Problem 1 is the example problem distributed with NPSOL; its description can be
found in [GMSW86a]. Problems 3 and 4 are slight reformulations of the same problem,
where the bounds $-1 \le x_3 \le 1$ have been replaced by the constraint $x_3^2 \le 1$. Problem
3 uses the starting point

$$\left( \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{\ldots}{10}, \tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}, \tfrac{2}{3}, -\tfrac{1}{3}, -\tfrac{1}{3} \right).$$

•  Descriptions for problems 6 and 12-15 can be found in [MS82]. The version of problem
6  considered  is  the  one  corresponding  to  a  value  T  =  10.  Problems  12  and  13  start 
from  point  (d)  for  Wright  No.  4  as  indicated  in  the  reference,  while  problems  14  and 
15  start  from  points  (a)  and  (b)  for  Wright  No.  9,  respectively. 

•  A  description  of  the  SQUARE  ROOT  problems  (17-20)  and  of  EXP6  (9)  can  be  found 
in  Fraley  [Fra88]. 

•  Problems  21-30  were  obtained  from  Boggs  and  Tolle  [BT84]. 

•  All  problems  having  names  starting  with  “HS”  are  from  Hock  and  Schittkowski  [HS81]. 

•  Problems  85-95  can  be  found  in  Dembo  [Dem76]. 

All  the  above  problems  have  been  used  in  the  past  to  test  NPSOL.  It  should  be  noted 
that  the  problems  in  this  group  are  small;  the  average  number  of  variables  is  10,  and  the 
average  number  of  constraints  is  6.  Nevertheless,  many  of  these  problems  are  considered 
hard  to  solve.  Moreover,  for  some  of  these  problems  the  assumptions  made  to  establish 
the  convergence  results  fail  to  hold;  for  example,  in  some  cases  the  Jacobian  of  the  active 
NP constraints at $x^*$ is singular, or no feasible points exist for some QP subproblems. In
problem  42  no  feasible  point  exists  for  NP. 

The  algorithms  have  also  been  tested  on  another  group  of  problems. 

•  The structural optimization problems 99-114 are described in Ringertz [Rin88]. The
letters “I” and “E” in the problem name indicate whether the formulation used included
the displacement variables explicitly (“E”) or eliminated them in advance (“I”). Also, the
following number (10, 25, 36 or 63) denotes the number of bars in the truss considered.
Finally, whenever a number is included at the end of the name (006, 040 or 060), the
initial point taken has been modified to be $x_j = 6$, 40 or 60, respectively.
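The naming convention just described can be decoded mechanically, as in the Python sketch below. The regular expression and the field names are assumptions made for this example; the three-letter suffixes that appear in some names (KON, VAN, DAT) are not explained in the text and are therefore kept as opaque tags.

```python
import re
from typing import Dict, Optional, Union

# STRUC + formulation letter + two-digit bar count + three-character suffix
_STRUC_NAME = re.compile(r"^STRUC([IE])(\d{2})(\w{3})$")


def parse_struc_name(name: str) -> Dict[str, Union[bool, int, str, Optional[float]]]:
    """Decode a structural-optimization problem name such as 'STRUCE25006'."""
    match = _STRUC_NAME.match(name)
    if match is None:
        raise ValueError("not a STRUC problem name: " + name)
    formulation, bars, tag = match.groups()
    # A purely numeric suffix gives the modified starting value x_j.
    start_value = float(tag) if tag.isdigit() else None
    return {
        "explicit_displacements": formulation == "E",
        "bars": int(bars),
        "tag": tag,
        "start_value": start_value,
    }
```

For example, `parse_struc_name("STRUCE25006")` describes the 25-bar truss with explicit displacement variables and starting values $x_j = 6$.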

These problems have been introduced due to the atypical behavior of quasi-Newton SQP
algorithms on them. For this group, the ratio of QP to nonlinear iterations is large when
compared to the size of the problem; on the first test set (problems 1-98) the average ratio
for NPSOL is 2 QP iterations per nonlinear iteration, while on problems 99-114 the average
ratio is 30.

The normal behavior of NPSOL on the first set of test problems is to require a relatively
large  number  of  QP  iterations  in  the  first  few  nonlinear  iterations.  Typically,  the  number 
of  QP  iterations  declines  exponentially  until  near  a  KKT  point,  when  only  one  iteration  is 
required.  The  STRUC  problems  depart  from  this  “standard”  behavior,  in  the  sense  that  the 
number  of  QP  iterations  declines  much  more  gradually.  (Although  only  one  QP  iteration 
is  required  in  the  end,  most  nonlinear  iterations  require  more.)  This  offers  the  possibility 
of  observing  the  reductions  that  can  be  achieved  by  using  the  early-termination  criterion, 
with  limited  distortion  from  the  asymptotic  behavior  of  NPSOL. 

Finally,  the  problems  in  this  second  group  are  larger  than  the  ones  presented  above; 
the  average  number  of  variables  is  now  55,  and  the  average  number  of  constraints  is  100. 
For  all  the  reasons  mentioned,  this  set  of  problems  provides  a  better  environment  in  which 
to  test  the  ability  of  the  proposed  early-termination  criterion  to  reduce  the  number  of  QP 
iterations. 

Computing  environment 

Version  4.02  of  NPSOL  was  used  in  these  comparisons.  For  this  test  set,  all  parameters 
used  in  the  code  have  been  fixed  at  their  default  values  (see  [GMSW86a]).  No  attempt  was 
made  to  improve  the  results  by  selecting  a  different  set  of  parameters.  It  would  be  difficult 
to  compare  the  relative  effort  to  adjust  input  parameters  for  the  two  algorithms.  The  runs 
were  performed  as  batch  jobs  on  a  DEC  VAXstation  II  with  5  Mb  main  memory.  The 
operating  system  was  VAX/VMS  version  4.5,  and  the  compiler  used  was  VAX  FORTRAN 
version  4.6  with  default  options. 

Results 

The  results  obtained  from  running  both  algorithms  on  the  test  set  are  presented  in  Table  2. 

The  parameters  chosen  to  characterize  the  relative  performance  of  both  algorithms  have 
been:  the  number  of  outer  (nonlinear)  iterations  for  each  problem;  the  number  of  calls  to 
the  routine  computing  the  values  of  the  objective  function,  the  constraint  functions  and 
their  derivatives  (function  evaluations);  the  total  number  of  inner  (QP)  iterations  for  the 
problem  (this  includes  the  number  of  iterations  necessary  to  compute  a  feasible  point);  and 
the  running  (CPU)  time  needed  to  solve  the  problem.  The  results  corresponding  to  both 
algorithms are given as a single entry in the tables, with the figures separated by a “/”
symbol,  in  the  form 

NPSOL  result/INPSOL  result. 

Given  that  most  of  the  problems  are  not  convex,  the  algorithms  may  converge  to  different 
KKT  points.  Three  such  events  occurred.  Another  possible  outcome  is  failure — that  is, 
the  algorithm  terminates  without  finding  a  solution,  because  the  iteration  limit  has  been 
exceeded,  because  no  significant  progress  can  be  made  at  the  current  point  with  respect  to 
the  merit  function,  or  because  the  objective  or  constraint  functions  need  to  be  evaluated  at 
a  point  for  which  they  are  not  defined  in  the  code.  Such  failures  are  indicated  by  “ — ” . 

For  the  set  of  114  problems,  NPSOL  was  able  to  find  a  KKT  point  in  107  cases,  while 
INPSOL was able to solve 105 problems. We should emphasize that only the default values
of the input parameters were used. Undoubtedly adjustment of the input parameters on the
problems that failed would have led to more successes. The figures illustrate the reliability
of INPSOL.

Table  1  presents  a  summary  of  the  results  for  the  four  quantities  monitored  in  Table  2. 
The  average  values  have  been  computed  as  the  geometric  means  for  the  ratios  of  the  values 
for  NPSOL  and  for  INPSOL;  that  is,  averages  larger  than  one  indicate  that  the  correspond¬ 
ing  value  for  NPSOL  is  larger  than  the  value  for  INPSOL.  Also,  the  averages  exclude  those 
problems  where  one  of  the  algorithms  failed.  Separate  entries  have  been  provided  for  prob¬ 
lems  1-98  (the  smaller  problems),  and  for  problems  99-114  (the  structural  optimization 
problems). 
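The averaging rule described above (geometric mean of the NPSOL/INPSOL ratios, excluding problems on which either code failed) can be sketched in Python as follows. The function names are hypothetical, and the treatment of entries such as 0/0, which the text does not discuss, is an assumption: they are skipped because their ratio is undefined.

```python
import math
from typing import Iterable, Optional, Tuple


def parse_entry(entry: str) -> Tuple[Optional[float], Optional[float]]:
    """Split a table entry 'NPSOL/INPSOL' into two numbers.
    A dash marks a failure and is returned as None; footnote marks are dropped."""
    left, right = entry.split("/")

    def value(token: str) -> Optional[float]:
        token = token.strip().rstrip("*†").strip()
        return None if token in {"—", "-", ""} else float(token)

    return value(left), value(right)


def geometric_mean_ratio(entries: Iterable[str]) -> float:
    """Geometric mean of NPSOL/INPSOL over entries where both codes succeeded."""
    logs = []
    for entry in entries:
        a, b = parse_entry(entry)
        if a is None or b is None or a <= 0.0 or b <= 0.0:
            continue  # failure, or undefined ratio such as 0/0 (assumed skipped)
        logs.append(math.log(a / b))
    if not logs:
        raise ValueError("no usable entries")
    return math.exp(sum(logs) / len(logs))
```

A value above one means that, on average, NPSOL needed more of the quantity in question than INPSOL. Applied to a column of Table 2, for instance the QP-iteration entries `["45/34", "4/4", "32/29", "—/14"]` of the first rows, the last entry is ignored because NPSOL failed on that problem.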


Table 1
Average Behavior: NPSOL vs. INPSOL

                        Problems
                        All      1-98     99-114
Nonlinear iterations    .988     .979     1.044
Function evaluations    .994     .999     .963
QP iterations           1.190    1.112    1.884
CPU time                1.043    1.022    1.200

We  now  comment  briefly  on  the  implications  of  these  results. 

•  The  early-termination  rule  seems  to  behave  very  well  regarding  the  numbers  of  non¬ 
linear  iterations  and  function  evaluations;  even  if  we  are  now  using  a  search  direction 
of  “worse  quality”  than  in  NPSOL,  the  numbers  are  very  close  for  both  algorithms. 

•  The  number  of  QP  iterations  is  reduced  by  20%  for  the  complete  set.  When  judging 
this  figure  we  must  take  into  account  that  the  problems  are  small,  implying  that 
the  number  of  QP  iterations  required  per  nonlinear  iteration  is  also  small.  (In  fact, 
the  average  value  for  the  test  set  is  5.6  QP  iterations  per  nonlinear  iteration.)  The 
opportunity  for  improvement  is  correspondingly  limited.  Moreover,  both  codes  use  the 
active  set  at  the  solution  of  the  previous  QP  subproblem  as  a  prediction  for  the  correct 
active  set  in  the  current  subproblem,  resulting  in  a  small  number  of  QP  iterations  close 
to  a  KKT  point.  As  a  result,  significant  savings  achieved  by  incomplete  solution  of 
QP  subproblems  in  the  early  iterations  are  masked  by  a  large  number  of  subproblems 
requiring  only  a  few  QP  iterations.  As  an  example,  for  problem  98  the  largest  number 
of  QP  iterations  needed  in  any  nonlinear  iteration  is  reduced  from  57  for  NPSOL  to 
15  for  INPSOL.  This  effect  is  much  less  clear  when  we  look  at  total  numbers  of  QP 
iterations  (244  for  NPSOL  vs.  170  for  INPSOL).  Recall  that  it  is  necessary  in  any 
implementation  to  limit  the  number  of  iterations  taken  to  solve  the  subproblem.  This 
large reduction in the maximum number of iterations is encouraging. Moreover, it
indicates that INPSOL and NPSOL took quite different paths to obtain a solution
on  many  of  the  problems.  In  the  light  of  this  fact  the  similarity  of  performance  is 
quite  remarkable.  Finally,  the  early-termination  rule  still  requires  a  feasible  point, 
and  the  feasibility  phase  is  the  same  as  in  NPSOL.  When  this  phase  accounts  for 
most  of  the  total  number  of  iterations,  as  with  the  STRUC  problems,  the  possibility 
of  improvement  is  further  diminished. 

Nonetheless,  it  should  be  noted  that  for  problems  99-114  the  improvement  obtained 
is  significantly  greater  than  20%,  as  the  mean  ratio  is  now  1.88;  in  fact,  when  we  look 
only  at  the  larger  problems,  the  relative  performance  of  INPSOL  improves  markedly. 
This  offers  the  promise  that  for  even  larger  problems  the  results  obtained  may  be 
substantially  better  than  the  values  shown  above. 

•  The  CPU  time  required  by  INPSOL  is  lower  than  the  time  for  NPSOL,  but  by  a 
factor  that  is  much  smaller  than  for  the  number  of  QP  iterations.  This  is  due  not 
only  to  the  fact  that  function  evaluations  can  be  expensive  when  compared  to  the 
effort  to  solve  each  QP  subproblem,  but  also  to  some  details  in  the  implementation 
that  have  been  chosen  to  affect  the  number  of  QP  iterations,  even  at  the  expense 
of  running  time.  For  example,  the  multiplier  estimate  used  for  the  linesearch  (the 
least-squares  multiplier)  is  expensive  to  compute  when  many  constraints  are  deleted 
in  the  last  step,  as  the  factorization  for  the  Jacobian  of  the  active  constraints  must 
be  updated.  There  are  still  options  to  be  explored  that  might  reduce  the  CPU  time 
for  the  modified  algorithm. 


Table 2
Numerical Results

No.  Problem name           Nonlinear   Function    QP          CPU
                            iterations  evaluations iterations  time (s)

1    NPSOL SAMPLE PROBLEM   12/13       16/18       45/34       3.69/3.61
2    SINGULAR               15/15       16/16       4/4         1.03/1.05
3    HEXAGON                15/16       21/23       32/29       4.41/4.41
4    HEXAGON (ALT. START)   11/11       16/14       35/26       3.56/3.26
5    LC7                    7/9         9/11        13/16       .76/.95
6    ALAN MANNE'S PROBLEM   17/17       18/18       40/37       21.13/21.92
7    ROSEN-SUZUKI           8/8         11/11       9/9         .81/.81
8    QP PROBLEM             8/10        9/11        23/15       1.10/1.04
9    EXP6                   33/53       35/57       38/57       1.96/3.08
10   STEINKE2               —/5         —/6         —/14        —/.87
11   NORWAY                 4/6†        5/7         34/13       1.23/.65
12   MHW4                   10/10       18/15       14/12       1.31/1.25
13   MHW9                   30/19†      56/28       42/24       3.71/2.31
14   MHW9 INEQUALITY 1      28/23       38/28       59/40       3.41/2.73
15   MHW9 INEQUALITY 2      41/14†      58/27       80/24       4.83/1.77
16   WOPLANT                25/29       29/33       44/35       6.85/7.17
17   SQUARE ROOT 1          —/—*        —/—         —/—         —/—
18   SQUARE ROOT 2          23/23       36/36       0/0         5.01/5.32
19   SQUARE ROOT 3          6/6         9/9         7/7         .95/.94
20   SQUARE ROOT 4          —/—*        —/—         —/—         —/—
21   BT1                    11/11       19/19       11/11       .81/.83
22   BT2                    9/9         14/14       9/9         .71/.70
23   BT3                    2/2         5/5         2/2         .19/.19
24   BT4                    12/12       18/18       13/13       .92/.92
25   BT5-HS63               6/6         9/9         8/8         .58/.58
26   BT6-HS77               15/15       21/21       16/16       1.52/1.54
27   BT7                    31/31       56/56       32/32       3.36/3.43
28   BT8                    17/17       19/19       17/17       1.25/1.44
29   BT9-HS39               13/13       16/16       14/14       .95/1.19
30   BT10                   8/8         11/11       0/0         .48/.52
31   BT11-HS79              9/9         12/12       10/10       1.05/1.06
32   BT12                   27/27       57/57       28/28       3.04/3.04
33   BT13                   32/32       44/44       34/34       2.61/2.62
34   POWELL TRIANGLES       23/15       37/16       36/23       3.27/2.28
35   POWELL BADLY SCALED    12/12       15/15       13/13       .85/.85
36   POWELL WRIGGLE         34/32       69/55       60/40       2.77/2.39
37   POWELL-MARATOS         6/6         7/7         6/6         .44/.44
38   HS72                   7/7         8/8         8/8         .69/.67
39   HS73 (CATTLE FEED)     4/4         5/5         4/4         .38/.36
40   HS107                  11/11       18/18       27/18       2.77/2.56
41   MUKAI-POLAK            10/10       16/16       13/13       1.08/1.11
42   INFEASIBLE SUBPROBLEM  —/—*        —/—         —/—         —/—
43   HS26                   47/47       64/64       48/48       3.39/3.41
44   HS32                   2/4         3/5         3/5         .25/.38
45   HS46                   55/55       58/58       56/56       5.26/4.98
46   HS51                   2/2         5/5         2/2         .18/.14
47   HS52                   2/2         5/5         2/2         .19/.16
48   HS53                   2/2         5/5         2/2         .19/.16
49   PENALTY1 A             16/16       18/19       77/41       20.01/16.49
50   PENALTY1 B             6/7         14/19       67/32       14.77/11.77
51   PENALTY1 C             29/15       85/40       152/65      24.35/11.65
52   HS13                   22/19       23/20       13/10       1.29/1.22
53   HS64                   29/43       39/62       47/60       2.34/3.33
54   HS65                   8/9         10/11       16/16       .70/.78
55   HS70                   36/—*       39/—        39/—        3.33/—
56   HS71                   5/7         6/9         9/9         .53/.67
57   HS74                   10/26       15/48       14/28       1.17/2.68
58   HS75                   6/8         10/11       7/9         .72/.90
59   HS78                   10/10       14/14       11/11       1.15/1.15
60   HS80                   8/8         10/10       8/8         .92/.92
61   HS81                   14/14       20/20       15/15       1.57/1.60
62   HS84                   —*/4        —/5         —/9         —/.51
63   HS85                   17/14       18/15       33/20       4.00/3.12
64   HS86 (COLVILLE 1)      6/7         8/8         11/11       .62/.64
65   HS87 (COLVILLE 6)      11/8        18/9        18/14       1.63/1.23
66   HS93                   12/12       15/15       14/14       1.36/1.38
67   HS95                   1/1         2/2         1/1         .15/.15
68   HS96                   1/1         2/2         1/1         .17/.15
69   HS97                   3/3         6/6         3/3         .40/.41
70   HS98                   3/3         6/6         8/8         .43/.44
71   HS99                   23/—*       44/—        74/—        3.99/—
72   HS100                  14/14       29/29       18/18       2.07/2.02
73   HS104                  18/18       20/20       23/23       3.36/3.37
74   HS105                  43/—*       61/—        97/—        27.14/—
75   HS108 (HEXAGON)        24/32       45/49       57/87       6.78/9.36
76   HS109                  11/10       13/11       25/29       3.23/3.26
77   HS110                  6/6         9/9         24/15       .78/.69
78   HS111                  41/49       64/75       44/52       8.08/9.05
79   HS112 (CHEMICAL EQ.)   19/—*       39/—        54/—        2.78/—
80   HS113                  14/16       19/23       38/36       3.12/3.41
81   HS114                  18/16       19/24       36/33       3.81/3.60
82   HS117 (COLVILLE 2)     17/18       21/27       96/39       6.75/5.34
83   HS118 (LC PROBLEM)     4/4         6/6         20/20       1.35/1.40
84   HS119 (COLVILLE 7)     12/17       16/19       41/47       4.25/5.60
85   DEMBO 1B               281/—*      437/—       296/—       75.46/—
86   DEMBO 2-HS83           4/4         6/6         4/4         .54/.54
87   DEMBO 3                9/8         11/9        37/20       2.01/1.78
88   DEMBO 4A               19/19       23/23       24/24       3.53/3.31
89   DEMBO 4C               13/13       15/15       20/23       3.10/3.20
90   DEMBO 5-HS106          17/18       21/24       30/31       2.90/3.04
91   DEMBO 6-HS116          36/43       96/69       144/248     21.84/29.65
92   DEMBO 7                19/12       24/15       126/68      15.54/9.82
93   DEMBO 8A               33/42       85/118      105/99      7.52/9.17
94   DEMBO 8B               29/29       69/71       88/73       6.51/6.45
95   DEMBO 8C               25/27       60/68       89/65       6.19/6.06
96   OPF                    18/17       19/18       53/51       468.12/456.10
97   GBD EQUILIBRIUM MOD.   5/6         6/7         37/26       6.22/6.10
98   WEAPON ASSIGNMENT      96/73       98/76       244/170     120.78/114.93
99   STRUCI10KON            18/17       34/30       65/42       13.67/11.73
100  STRUCE10KON            26/29       49/67       87/84       17.68/20.75
101  STRUCI10VAN            23/19       41/34       54/51       16.30/13.85
102  STRUCE10VAN            —*/24       —/4C        —/91        —/19.44
103  STRUCI25006            42/37       68/62       147/85      92.44/80.99
104  STRUCE25006            20/28       32/36       178/95      357.83/260.79
105  STRUCI25DAT            11/12       19/21       24/22       24.75/27.11
106  STRUCE25DAT            52/21       106/37      687/65      647.13/191.44
107  STRUCI36DAT            23/20       38/34       59/46       120.79/108.02
108  STRUCE36DAT            29/30       53/62       87/90       971.16/1021.87
109  STRUCI63040            117/112     211/202     6116/3091   8182.13/7159.03
110  STRUCE63040            375/—*      794/—       3545/—      77286.64/—
111  STRUCI63060            —*/98       —/244       —/3899      —/8281.02
112  STRUCE63060            63/115      150/316     6675/3407   25090.15/33228.42
113  STRUCI63DAT            246/136     354/412     9043/2060   12591.61/11424.54
114  STRUCE63DAT            52/72       86/145      8049/2858   41793.84/22740.66

* Failed to solve the problem.

† Converged to a different minimizer.